Proximal Deterministic Policy Gradient

Maggipinto, Marco; Susto, Gian Antonio; Chaudhari, Pratik

Computer Science > Machine Learning

arXiv:2008.00759 (cs)

[Submitted on 3 Aug 2020]

Abstract: This paper introduces two simple techniques to improve off-policy Reinforcement Learning (RL) algorithms. First, we formulate off-policy RL as a stochastic proximal point iteration: the target network plays the role of the optimization variable, and the value network computes the proximal operator. Second, we exploit the two value functions commonly employed in state-of-the-art off-policy algorithms to provide an improved action-value estimate through bootstrapping, with only a limited increase in computational resources. Further, we demonstrate significant performance improvement over state-of-the-art algorithms on standard continuous-control RL benchmarks.
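
As a rough illustration of the proximal point iteration mentioned in the abstract, the sketch below applies it to a toy scalar objective. It is not the paper's algorithm: the objective, the function names `prox_step` and `proximal_point`, and every hyperparameter are assumptions made for this example. The only parallel intended is structural: an outer variable (`target`) is held fixed while an inner loop approximately solves the proximal problem, and the result then becomes the new target.

```python
def prox_step(grad_f, target, lam, inner_steps=200, lr=0.05):
    """Approximate prox_{lam*f}(target), i.e. the minimizer over w of
    f(w) + (w - target)**2 / (2 * lam), using plain gradient descent.
    `target` is held fixed during the inner loop (hypothetical helper)."""
    w = target  # warm-start the inner problem at the current target
    for _ in range(inner_steps):
        # Gradient of f plus gradient of the proximal penalty.
        g = grad_f(w) + (w - target) / lam
        w -= lr * g
    return w


def proximal_point(grad_f, w0, lam=1.0, outer_steps=50):
    """Outer loop: repeatedly replace the target by its proximal update."""
    target = w0
    for _ in range(outer_steps):
        target = prox_step(grad_f, target, lam)
    return target


# Toy objective (an assumption for this sketch): f(w) = 0.5 * (w - 3)^2,
# whose minimizer is w = 3.
def grad_f(w):
    return w - 3.0


w_star = proximal_point(grad_f, w0=0.0)
print(w_star)
```

With `lam=1.0`, each exact proximal step on this objective moves the target halfway toward the minimizer, so a smaller `lam` would give more conservative updates and a larger one more aggressive updates.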

Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:2008.00759 [cs.LG]
(or arXiv:2008.00759v1 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.2008.00759

Submission history

From: Marco Maggipinto
[v1] Mon, 3 Aug 2020 10:19:59 UTC (688 KB)
