RORL: Robust Offline Reinforcement Learning via Conservative Smoothing

Yang, Rui; Bai, Chenjia; Ma, Xiaoteng; Wang, Zhaoran; Zhang, Chongjie; Han, Lei

Computer Science > Machine Learning

arXiv:2206.02829 (cs)

[Submitted on 6 Jun 2022 (v1), last revised 22 Oct 2022 (this version, v3)]

Abstract: Offline reinforcement learning (RL) provides a promising direction for exploiting massive amounts of offline data in complex decision-making tasks. Because of distribution shift, current offline RL algorithms are generally designed to be conservative in value estimation and action selection. However, such conservatism can impair the robustness of learned policies when they encounter observation deviations under realistic conditions, such as sensor errors and adversarial attacks. To trade off robustness and conservatism, we propose Robust Offline Reinforcement Learning (RORL) with a novel conservative smoothing technique. In RORL, we explicitly introduce regularization on the policy and the value function for states near the dataset, as well as additional conservative value estimation on these states. Theoretically, we show that RORL enjoys a tighter suboptimality bound than recent theoretical results in linear MDPs. We demonstrate that RORL achieves state-of-the-art performance on the general offline RL benchmark and is considerably robust to adversarial observation perturbations.
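
The abstract does not state the loss itself, so the following is only a minimal sketch of one way a smoothness regularizer over "states near the dataset" could be written. Everything in it is an assumption made for illustration: the toy linear Q-function, the L-infinity ball of radius `epsilon`, the max-over-samples aggregation, and all function names are hypothetical and are not taken from the paper or its code.

```python
import random


def q_value(state, action, weights):
    """Toy linear Q-function (hypothetical stand-in for a learned critic)."""
    features = list(state) + list(action)
    return sum(w * x for w, x in zip(weights, features))


def sample_perturbed_states(state, epsilon, n_samples, rng):
    """Sample states uniformly from the L-infinity ball of radius epsilon around `state`."""
    return [
        [s + rng.uniform(-epsilon, epsilon) for s in state]
        for _ in range(n_samples)
    ]


def smoothing_penalty(state, action, weights, epsilon, n_samples, rng):
    """Largest squared Q-value gap between a dataset state and sampled nearby states.

    Assumption: the worst sampled gap is used as the penalty; other
    aggregations (for example, the mean) would be equally plausible.
    """
    q_clean = q_value(state, action, weights)
    gaps = [
        (q_value(s_hat, action, weights) - q_clean) ** 2
        for s_hat in sample_perturbed_states(state, epsilon, n_samples, rng)
    ]
    return max(gaps)


# Illustrative values only; none of these numbers come from the paper.
rng = random.Random(0)
state, action = [0.5, -1.0], [0.2]
weights = [1.0, 2.0, 0.5]
penalty = smoothing_penalty(state, action, weights, epsilon=0.1, n_samples=16, rng=rng)
zero_penalty = smoothing_penalty(state, action, weights, epsilon=0.0, n_samples=16, rng=rng)
```

In this toy setting the penalty shrinks to zero as `epsilon` goes to zero and grows with the sensitivity of the Q-function to its state input, which is the qualitative behavior a smoothing term is meant to discourage. An analogous term could be written for the policy by comparing its outputs at the clean and perturbed states.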

Comments: Accepted by Advances in Neural Information Processing Systems (NeurIPS) 2022
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as: arXiv:2206.02829 [cs.LG] (or arXiv:2206.02829v3 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2206.02829

Submission history

From: Rui Yang
[v1] Mon, 6 Jun 2022 18:07:41 UTC (2,535 KB)
[v2] Mon, 26 Sep 2022 15:41:33 UTC (2,915 KB)
[v3] Sat, 22 Oct 2022 10:09:17 UTC (2,918 KB)
