Stick-Breaking Policy Learning in Dec-POMDPs

Liu, Miao; Amato, Christopher; Liao, Xuejun; Carin, Lawrence; How, Jonathan P.

arXiv:1505.00274v1 (cs)

[Submitted on 1 May 2015 (this version), latest version 23 Nov 2015 (v2)]

Abstract: Expectation maximization (EM) has recently been shown to be an efficient algorithm for learning finite-state controllers (FSCs) in large decentralized POMDPs (Dec-POMDPs). However, current methods use fixed-size FSCs and often converge to local maxima that are far from optimal. This paper considers a variable-size FSC to represent the local policy of each agent. These variable-size FSCs are constructed using a stick-breaking prior, leading to a new framework called decentralized stick-breaking policy representation (Dec-SBPR). This approach learns the controller parameters with a variational Bayesian algorithm without having to assume that the Dec-POMDP model is available. The performance of Dec-SBPR is demonstrated on several benchmark problems, showing that the algorithm scales to large problems while outperforming other state-of-the-art methods.
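
For readers unfamiliar with the term, the following is a minimal sketch of a generic truncated stick-breaking construction. It is not the Dec-SBPR algorithm from the paper. The function name `stick_breaking_weights`, the concentration parameter `alpha`, and the truncation level are illustrative assumptions. The sketch only shows why such a prior tends to concentrate probability mass on a few components, which is what makes a variable effective controller size possible.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw weights from a truncated stick-breaking construction.

    Hypothetical helper for illustration only. Each fraction
    v_k ~ Beta(1, alpha) breaks a piece off the remaining stick;
    the last component takes whatever is left, so the weights sum to 1.
    """
    weights = []
    remaining = 1.0
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction of the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


rng = random.Random(0)  # fixed seed so the draw is repeatable
weights = stick_breaking_weights(alpha=1.0, truncation=10, rng=rng)

# Count components that carry non-negligible mass (threshold is arbitrary).
effective = sum(1 for w in weights if w > 0.01)
print([round(w, 3) for w in weights])
print("components with weight > 0.01:", effective)
```

Smaller values of `alpha` typically put most of the mass on the first few components, while larger values spread it over more of them. In a controller setting, one could read each component as a candidate node, but how the paper parameterizes and learns these weights is not shown here.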

Subjects:	Artificial Intelligence (cs.AI); Systems and Control (eess.SY); Machine Learning (stat.ML)
Cite as:	arXiv:1505.00274 [cs.AI]
	(or arXiv:1505.00274v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.1505.00274

Submission history

From: Miao Liu
[v1] Fri, 1 May 2015 20:29:27 UTC (638 KB)
[v2] Mon, 23 Nov 2015 20:48:32 UTC (638 KB)
