Projection Implicit Q-Learning with Support Constraint for Offline Reinforcement Learning

Han, Xinchen; Afifi, Hossam; Marot, Michel

arXiv:2501.08907 (cs)

[Submitted on 15 Jan 2025]

Abstract: Offline Reinforcement Learning (RL) faces a critical challenge: extrapolation errors caused by out-of-distribution (OOD) actions. The Implicit Q-Learning (IQL) algorithm employs expectile regression to achieve in-sample learning, effectively mitigating the risks associated with OOD actions. However, its fixed hyperparameter in policy evaluation and its density-based policy improvement method limit its overall efficiency. In this paper, we propose Proj-IQL, a projective IQL algorithm enhanced with a support constraint. In the policy evaluation phase, Proj-IQL generalizes the one-step approach to a multi-step approach through vector projection, while maintaining the in-sample learning and expectile regression framework. In the policy improvement phase, Proj-IQL introduces a support constraint that is more aligned with the policy evaluation approach. Furthermore, we theoretically demonstrate that Proj-IQL guarantees monotonic policy improvement and enjoys a progressively more rigorous criterion for superior actions. Empirical results demonstrate that Proj-IQL achieves state-of-the-art performance on D4RL benchmarks, especially in challenging navigation domains.
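
For readers unfamiliar with the expectile regression that the abstract attributes to IQL, the following is a minimal sketch of the standard asymmetric squared loss, |tau - 1(u < 0)| * u^2, and of fitting a scalar expectile by gradient descent. It illustrates the general technique only, not the Proj-IQL method; the function names `expectile_loss` and `fit_expectile`, the default `tau=0.7`, the learning rate, and the step count are all assumptions chosen for illustration.

```python
def expectile_loss(diff, tau=0.7):
    """Asymmetric squared loss: weight tau on non-negative residuals, 1 - tau otherwise."""
    weight = tau if diff >= 0 else 1.0 - tau
    return weight * diff * diff


def fit_expectile(samples, tau=0.7, lr=0.1, steps=2000):
    """Estimate the tau-expectile of `samples` by gradient descent on the mean loss.

    Hypothetical helper for illustration; a real agent would fit a value
    network over states rather than a single scalar.
    """
    v = sum(samples) / len(samples)  # start from the mean (the 0.5-expectile)
    for _ in range(steps):
        grad = 0.0
        for x in samples:
            diff = x - v
            weight = tau if diff >= 0 else 1.0 - tau
            grad += -2.0 * weight * diff  # derivative of weight * diff**2 w.r.t. v
        v -= lr * grad / len(samples)
    return v
```

With `tau = 0.5` the estimate coincides with the sample mean, and as `tau` approaches 1 it moves toward the largest sample. In the usual description of IQL, the residual is the gap between a Q-value on a dataset action and the state value, so a `tau` above 0.5 biases the value estimate toward the better in-sample actions without querying OOD actions.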

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2501.08907 [cs.LG]
	(or arXiv:2501.08907v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2501.08907

Submission history

From: Xinchen Han
[v1] Wed, 15 Jan 2025 16:17:02 UTC (1,253 KB)
