An Empirical Study of Deep Reinforcement Learning in Continuing Tasks

Wan, Yi; Korenkevych, Dmytro; Zhu, Zheqing

Abstract: In reinforcement learning (RL), continuing tasks are tasks in which the agent-environment interaction is ongoing and cannot be broken down into episodes. This formulation is appropriate when environment resets are unavailable, agent-controlled, or predefined but all rewards, including those beyond resets, are critical. Such scenarios occur frequently in real-world applications and cannot be modeled by episodic tasks. While modern deep RL algorithms have been extensively studied and are well understood in episodic tasks, their behavior in continuing tasks remains underexplored. To address this gap, we provide an empirical study of several well-known deep RL algorithms using a suite of continuing task testbeds based on MuJoCo and Atari environments, highlighting several key insights concerning continuing tasks. Using these testbeds, we also investigate the effectiveness of a method for improving temporal-difference-based RL algorithms in continuing tasks by centering rewards, as introduced by Naik et al. (2024). While their work primarily focused on this method in conjunction with Q-learning, our results extend their findings by demonstrating that this method is effective across a broader range of algorithms, scales to larger tasks, and outperforms two other reward-centering approaches.
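
To make the idea of centering rewards concrete, the following is a minimal sketch, not the authors' implementation. It assumes a tabular Q-learning setting and an average-reward estimate updated from the TD error, loosely in the spirit of Naik et al. (2024). The function name `centered_q_update` and the parameters `alpha`, `eta`, and `gamma` are hypothetical choices made for illustration.

```python
from typing import List, Tuple


def centered_q_update(
    q: List[List[float]],   # q[state][action], modified in place
    r_bar: float,           # current estimate of the average reward
    s: int,
    a: int,
    r: float,
    s_next: int,
    alpha: float = 0.1,     # value step size (assumed)
    eta: float = 0.1,       # relative step size for r_bar (assumed)
    gamma: float = 0.99,    # discount factor (assumed)
) -> Tuple[float, float]:
    """Apply one Q-learning update on the centered reward (r - r_bar).

    There is no terminal-state branch: in a continuing task the
    bootstrap term is always present.
    Returns the updated average-reward estimate and the TD error.
    """
    # TD error computed on the centered reward instead of the raw reward.
    delta = (r - r_bar) + gamma * max(q[s_next]) - q[s][a]
    # Standard tabular value update.
    q[s][a] += alpha * delta
    # Move the average-reward estimate in the direction of the TD error.
    r_bar += eta * alpha * delta
    return r_bar, delta


# Usage: a single transition in a two-state, two-action task.
q_table = [[0.0, 0.0], [0.0, 0.0]]
avg_reward, td_error = centered_q_update(q_table, 0.0, s=0, a=0, r=1.0, s_next=1)
```

Subtracting an estimate of the average reward keeps the learned values near zero when rewards carry a large constant offset, which is the motivation usually given for centering in discounted continuing tasks.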

Subjects:	Artificial Intelligence (cs.AI)
Cite as:	arXiv:2501.06937 [cs.AI]
	(or arXiv:2501.06937v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2501.06937

