Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning

Nauman, Michal; Bortkiewicz, Michał; Miłoś, Piotr; Trzciński, Tomasz; Ostaszewski, Mateusz; Cygan, Marek

arXiv:2403.00514 (cs)

[Submitted on 1 Mar 2024 (v1), last revised 19 Jun 2024 (this version, v2)]

Abstract: Recent advancements in off-policy Reinforcement Learning (RL) have significantly improved sample efficiency, primarily due to the incorporation of various forms of regularization that enable more gradient update steps than traditional agents. However, many of these techniques have been tested in limited settings, often on tasks from single simulation benchmarks and against well-known algorithms rather than a range of regularization approaches. This limits our understanding of the specific mechanisms driving RL improvements. To address this, we implemented over 60 different off-policy agents, each integrating established regularization techniques from recent state-of-the-art algorithms. We tested these agents across 14 diverse tasks from 2 simulation benchmarks, measuring training metrics related to overestimation, overfitting, and plasticity loss -- issues that motivate the examined regularization techniques. Our findings reveal that while the effectiveness of a specific regularization setup varies with the task, certain combinations consistently demonstrate robust and superior performance. Notably, a simple Soft Actor-Critic agent, appropriately regularized, reliably finds a better-performing policy within the training regime, which previously was achieved mainly through model-based approaches.

Comments:	ICML 2024
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2403.00514 [cs.LG]
	(or arXiv:2403.00514v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2403.00514

Submission history

From: Mateusz Ostaszewski
[v1] Fri, 1 Mar 2024 13:25:10 UTC (29,011 KB)
[v2] Wed, 19 Jun 2024 11:32:01 UTC (42,208 KB)
