Variational Policy Propagation for Multi-agent Reinforcement Learning

Qu, Chao; Li, Hui; Liu, Chang; Xiong, Junwu; Zhang, James; Chu, Wei; Wang, Weiqiang; Qi, Yuan; Song, Le

arXiv:2004.08883 (cs)

[Submitted on 19 Apr 2020 (v1), last revised 29 Jan 2022 (this version, v4)]

Abstract: We propose a collaborative multi-agent reinforcement learning algorithm, variational policy propagation (VPP), that learns a joint policy through interactions among agents. We prove that, under some mild conditions, the joint policy is a Markov Random Field, which in turn effectively reduces the policy space. We integrate variational inference into the policy as special differentiable layers, so that actions can be sampled efficiently from the Markov Random Field and the overall policy is differentiable. We evaluate the algorithm on several challenging large-scale tasks and demonstrate that it outperforms previous state-of-the-art methods.
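
To make the idea of sampling joint actions from a Markov Random Field through variational inference more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a pairwise MRF over agents on a fixed graph, a single symmetric pairwise log-potential shared by all edges, and plain mean-field updates run for a fixed number of iterations (loosely analogous to stacking a fixed number of layers). The names `mean_field`, `sample_actions`, `unary`, `pairwise`, and `edges` are hypothetical and do not come from the paper.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_field(unary, pairwise, edges, num_iters=10):
    """Approximate per-agent action marginals of a pairwise MRF.

    unary[i][a]    : log-potential of agent i taking action a (assumed input).
    pairwise[a][b] : symmetric log-potential shared by every edge (assumption).
    edges          : undirected (i, j) pairs describing which agents interact.
    """
    n, k = len(unary), len(unary[0])
    neighbors = {i: [] for i in range(n)}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Start from the marginals implied by the unary terms alone.
    q = [softmax(u) for u in unary]
    for _ in range(num_iters):
        new_q = []
        for i in range(n):
            scores = []
            for a in range(k):
                # Expected pairwise log-potential under the neighbours' beliefs.
                msg = sum(q[j][b] * pairwise[a][b]
                          for j in neighbors[i] for b in range(k))
                scores.append(unary[i][a] + msg)
            new_q.append(softmax(scores))
        # Synchronous update: simple, but not guaranteed to converge in general.
        q = new_q
    return q


def sample_actions(q, rng):
    """Draw one action per agent from its approximate marginal."""
    return [rng.choices(range(len(qi)), weights=qi)[0] for qi in q]


if __name__ == "__main__":
    # Three agents in a chain, two actions; neighbours are rewarded for agreeing.
    unary = [[0.0, 2.0], [0.0, 0.0], [0.0, 0.0]]
    pairwise = [[2.0, 0.0], [0.0, 2.0]]
    edges = [(0, 1), (1, 2)]
    q = mean_field(unary, pairwise, edges)
    print(q)
    print(sample_actions(q, random.Random(0)))
```

In this toy setup only the first agent has a preference of its own, yet the preference spreads along the chain through the pairwise terms, so every agent's marginal ends up favouring the same action. Each update uses only softmax and weighted sums, which is why this kind of iteration can in principle be differentiated through when written in an autodiff framework; how the paper parameterises and trains its layers is not specified in this abstract.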

Comments:	The title of previous version was "Intention Propagation for Multi-agent Reinforcement Learning"
Subjects:	Machine Learning (cs.LG); Multiagent Systems (cs.MA); Machine Learning (stat.ML)
Cite as:	arXiv:2004.08883 [cs.LG]
	(or arXiv:2004.08883v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2004.08883

Submission history

From: Chao Qu
[v1] Sun, 19 Apr 2020 15:42:55 UTC (1,365 KB)
[v2] Sun, 16 Aug 2020 05:13:41 UTC (1,628 KB)
[v3] Mon, 18 Jan 2021 02:16:01 UTC (4,163 KB)
[v4] Sat, 29 Jan 2022 11:08:12 UTC (3,735 KB)
