Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning

Feinberg, Vladimir; Wan, Alvin; Stoica, Ion; Jordan, Michael I.; Gonzalez, Joseph E.; Levine, Sergey


arXiv:1803.00101 (cs)

[Submitted on 28 Feb 2018]


Abstract: Recent model-free reinforcement learning algorithms have proposed incorporating learned dynamics models as a source of additional data with the intention of reducing sample complexity. Such methods hold the promise of incorporating imagined data coupled with a notion of model uncertainty to accelerate the learning of continuous control tasks. Unfortunately, they rely on heuristics that limit usage of the dynamics model. We present model-based value expansion, which controls for uncertainty in the model by only allowing imagination to fixed depth. By enabling wider use of learned dynamics models within a model-free reinforcement learning algorithm, we improve value estimation, which, in turn, reduces the sample complexity of learning.
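
The idea of limiting imagination to a fixed depth can be illustrated with a small sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: a deterministic learned dynamics function `model`, a known `reward` function, a current policy `policy`, a learned value estimate `value`, and a scalar state. The function rolls the model forward for `horizon` imagined steps, sums the discounted imagined rewards, and bootstraps the remainder from the value estimate. All names are hypothetical and this is not the authors' implementation.

```python
from typing import Callable


def value_expansion_target(
    state: float,
    policy: Callable[[float], float],
    model: Callable[[float, float], float],
    reward: Callable[[float, float], float],
    value: Callable[[float], float],
    horizon: int,
    gamma: float,
) -> float:
    """Fixed-depth target: imagined discounted rewards plus a bootstrapped tail.

    Illustrative only; assumes deterministic dynamics and a scalar state.
    """
    total = 0.0
    discount = 1.0
    s = state
    for _ in range(horizon):
        a = policy(s)                    # action chosen by the current policy
        total += discount * reward(s, a)
        s = model(s, a)                  # imagined next state from the learned model
        discount *= gamma
    # Beyond the fixed depth, fall back on the learned value estimate.
    return total + discount * value(s)


# Toy components (hypothetical): linear dynamics, quadratic cost.
def toy_policy(s: float) -> float:
    return -0.5 * s


def toy_model(s: float, a: float) -> float:
    return s + a


def toy_reward(s: float, a: float) -> float:
    return -s * s


def toy_value(s: float) -> float:
    return -s * s


target = value_expansion_target(
    1.0, toy_policy, toy_model, toy_reward, toy_value, horizon=2, gamma=0.5
)
```

With `horizon=0` the function reduces to the plain value estimate of the starting state. Larger horizons lean more on the learned model, so in this sketch the depth is the knob that trades off model error against value-estimate error.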

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as:	arXiv:1803.00101 [cs.LG]
	(or arXiv:1803.00101v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1803.00101

Submission history

From: Vladimir Feinberg
[v1] Wed, 28 Feb 2018 21:43:37 UTC (4,856 KB)
