Planning in Markov Decision Processes with Gap-Dependent Sample Complexity

Jonsson, Anders; Kaufmann, Emilie; Ménard, Pierre; Domingues, Omar Darwiche; Leurent, Edouard; Valko, Michal

arXiv:2006.05879 (cs)

[Submitted on 10 Jun 2020]

Abstract: We propose MDP-GapE, a new trajectory-based Monte-Carlo Tree Search algorithm for planning in a Markov Decision Process in which transitions have a finite support. We prove an upper bound on the number of calls to the generative model needed for MDP-GapE to identify a near-optimal action with high probability. This problem-dependent sample complexity result is expressed in terms of the sub-optimality gaps of the state-action pairs that are visited during exploration. Our experiments reveal that MDP-GapE is also effective in practice, in contrast with other algorithms with sample complexity guarantees in the fixed-confidence setting, which are mostly theoretical.

Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:2006.05879 [cs.LG] (or arXiv:2006.05879v1 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.2006.05879

Submission history

From: Edouard Leurent
[v1] Wed, 10 Jun 2020 15:05:51 UTC (1,282 KB)
