Placement Optimization with Deep Reinforcement Learning
Anna Goldie and Azalia Mirhoseini
{agoldie, azalia}@google.com
Google Brain
ABSTRACT
Placement Optimization is an important problem in systems and
chip design, which consists of mapping the nodes of a graph onto
a limited set of resources to optimize for an objective, subject to
constraints. In this paper, we start by motivating reinforcement
learning as a solution to the placement problem. We then give an
overview of what deep reinforcement learning is. We next formu-
late the placement problem as a reinforcement learning problem,
and show how this problem can be solved with policy gradient
optimization. Finally, we describe lessons we have learned from
training deep reinforcement learning policies across a variety of
placement optimization problems.
KEYWORDS
Deep Learning, Reinforcement Learning, Placement Optimization,
Device Placement, RL for Combinatorial Optimization
ACM Reference Format:
Anna Goldie and Azalia Mirhoseini. 2020. Placement Optimization with
Deep Reinforcement Learning. In Proceedings of the 2020 International Sym-
posium on Physical Design (ISPD ’20), March 29-April 1, 2020, Taipei, Taiwan.
ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3372780.3378174
1 INTRODUCTION
An important problem in systems and chip design is Placement
Optimization, which refers to the problem of mapping the nodes of
a graph onto a limited set of resources to optimize for an objective,
subject to constraints. Common examples of this class of problem
include placement of TensorFlow graphs onto hardware devices to
minimize training or inference time, or placement of an ASIC or
FPGA netlist onto a grid to optimize for power, performance, and
area.
Placement is a very challenging problem: several factors, including the size and topology of the input graph, the number and properties of available resources, and the requirements and constraints of feasible placements, all contribute to its complexity. There are many approaches to the placement problem. A range of algorithms, including analytical approaches [3, 12, 14, 15], genetic and hill-climbing methods [4, 6, 13], Integer Linear Programming (ILP) [2, 27], and problem-specific heuristics, have been proposed.
More recently, a new type of approach to the placement problem, based on deep Reinforcement Learning (RL) [16, 17, 28], has
emerged. RL-based methods bring new challenges, such as interpretability, brittleness of training to convergence, and unsafe exploration. However, they also offer new opportunities, such as the ability to leverage distributed computing, ease of problem formulation, end-to-end optimization, and domain adaptation, meaning that these methods can potentially transfer what they learn from previous problems to new unseen instances.
In this paper, we start by motivating reinforcement learning as
a solution to the placement problem. We then give an overview
of what deep reinforcement learning is. We next formulate the
placement problem as an RL problem, and show how this problem
can be solved with policy gradient optimization. Finally, we describe
lessons we have learned from training deep RL policies across a
variety of placement optimization problems.
2 DEEP REINFORCEMENT LEARNING
Most successful applications of machine learning are examples of supervised learning, where a model is trained to approximate a particular function, given many input-output examples (e.g. given many images labeled as cat or dog, learn to predict whether a given image is that of a cat or a dog). Today's state-of-the-art supervised models are typically deep learning models, meaning that the
function approximation is achieved by updating the weights of a
multi-layered (deep) neural network via gradient descent against a
differentiable loss function.
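To make this setup concrete, here is a minimal sketch of supervised deep learning in PyTorch. It is illustrative only and not from the paper; the dataset, network sizes, and hyperparameters are all made up:

```python
import torch
import torch.nn as nn

# Hypothetical dataset: 64 feature vectors, each labeled 0 ("cat") or 1 ("dog").
X = torch.randn(64, 16)
y = torch.randint(0, 2, (64,))

# A small multi-layered (deep) network approximating the labeling function.
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Linear(32, 2),
)
loss_fn = nn.CrossEntropyLoss()  # a differentiable loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)  # compare predictions against the labels
    loss.backward()              # gradient of the loss w.r.t. the weights
    optimizer.step()             # gradient-descent weight update
```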
Reinforcement learning, on the other hand, is a separate branch
of machine learning in which a model, or policy in RL parlance,
learns to take actions in an environment (either the real world or a
simulation) to maximize a given reward function. One well-known
example of reinforcement learning is AlphaGo [23], in which a policy learned to take actions (moves in the game of Go) to maximize its reward function (number of winning games). Deep reinforcement learning is simply reinforcement learning in which the policy
is a deep neural network.
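By contrast with the supervised sketch above, a deep RL policy maps a state to a distribution over actions and is trained from reward rather than labeled outputs. A minimal sketch, assuming a hypothetical 8-dimensional state and 4 discrete actions:

```python
import torch
import torch.nn as nn

# A deep RL policy: a network mapping a state observation to a probability
# distribution over actions (toy sizes: 8-dim state, 4 discrete actions).
policy = nn.Sequential(
    nn.Linear(8, 32),
    nn.ReLU(),
    nn.Linear(32, 4),
    nn.Softmax(dim=-1),
)

state = torch.randn(8)                              # current state s_t
action_probs = policy(state)                        # pi(a | s_t)
action = torch.multinomial(action_probs, 1).item()  # sample an action to take
# The environment returns a reward and the next state s_{t+1}; training
# updates the weights so that high-reward actions become more probable.
```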
RL problems can be reformulated as Markov Decision Processes (MDPs). MDPs rely on the Markov assumption, meaning that the next state $s_{t+1}$ depends only on the current state $s_t$, and is conditionally independent of the past:

$$P(s_{t+1} \mid s_0, \ldots, s_t) = P(s_{t+1} \mid s_t)$$
Like MDPs, RL problems are defined by five key components (a toy instance is sketched after this list):
• states: the set of possible states of the world (e.g. the set of valid board positions in Go)
• actions: the set of actions that can be taken by the agent (e.g. all valid moves in a game of Go)
• state transition probabilities: the probability of transitioning between any two given states
• rewards: the reward received for taking an action in a given state (e.g. whether the game of Go is ultimately won or lost)
• discount factor: how much future rewards are discounted relative to immediate ones
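The following sketch encodes these five components as plain Python data structures. The two-state, two-action MDP is hypothetical and chosen only to make each component explicit:

```python
# A hypothetical two-state, two-action MDP, encoding the five components.
states = ["s0", "s1"]
actions = ["stay", "move"]

# State transition probabilities: P[s][a][s_next] = P(s_next | s, a).
P = {
    "s0": {"stay": {"s0": 1.0, "s1": 0.0}, "move": {"s0": 0.2, "s1": 0.8}},
    "s1": {"stay": {"s0": 0.0, "s1": 1.0}, "move": {"s0": 0.8, "s1": 0.2}},
}

# Rewards: R[s][a] = expected reward for taking action a in state s.
R = {
    "s0": {"stay": 0.0, "move": 1.0},
    "s1": {"stay": 0.5, "move": 0.0},
}

gamma = 0.9  # discount factor: weight of future rewards vs. immediate ones
```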