Cyclophobic Reinforcement Learning

Wagner, Stefan Sylvius; Arndt, Peter; Robine, Jan; Harmeling, Stefan

arXiv:2308.15911 (cs)

[Submitted on 30 Aug 2023]

Abstract: In environments with sparse rewards, finding a good inductive bias for exploration is crucial to the agent's success. However, there are two competing goals: novelty search and systematic exploration. While existing approaches such as curiosity-driven exploration find novelty, they sometimes fail to explore the whole state space systematically, much as depth-first search differs from breadth-first search. In this paper, we propose a new intrinsic reward that is cyclophobic, i.e., it does not reward novelty but punishes redundancy by avoiding cycles. By augmenting the cyclophobic intrinsic reward with a sequence of hierarchical representations based on the agent's cropped observations, we achieve excellent results in the MiniGrid and MiniHack environments. Both are particularly hard, as they require complex interactions with different objects in order to be solved. Detailed comparisons with previous approaches and thorough ablation studies show that our newly proposed cyclophobic reinforcement learning is more sample-efficient than other state-of-the-art methods on a variety of tasks.
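To make the idea of "punishing redundancy by avoiding cycles" concrete, here is a minimal sketch. It is not the paper's formulation, which the abstract does not spell out. It assumes that a cycle means revisiting a state already seen in the current episode, and that cropped observations are square windows around the agent. The names `CyclePenalty`, `cropped_views`, the `penalty` value and the padding value `-1` are all hypothetical.

```python
from typing import Hashable, List, Sequence, Set, Tuple

Grid = Sequence[Sequence[int]]
View = Tuple[Tuple[int, ...], ...]


class CyclePenalty:
    """Illustrative intrinsic reward (an assumption, not the paper's definition):
    0 for a first visit, a negative penalty for revisiting a state in the episode."""

    def __init__(self, penalty: float = -1.0) -> None:
        self.penalty = penalty
        self.visited: Set[Hashable] = set()

    def reset(self, start_state: Hashable) -> None:
        # Start a new episode; only the start state counts as visited.
        self.visited = {start_state}

    def reward(self, next_state: Hashable) -> float:
        # Revisiting a state closes a cycle in the episode's trajectory.
        if next_state in self.visited:
            return self.penalty
        self.visited.add(next_state)
        return 0.0


def cropped_views(grid: Grid, row: int, col: int, radii: Sequence[int]) -> List[View]:
    """Return one square crop per radius, centered on (row, col).
    Cells outside the grid are padded with -1 (a hypothetical choice)."""
    views: List[View] = []
    for r in radii:
        view = tuple(
            tuple(
                grid[i][j] if 0 <= i < len(grid) and 0 <= j < len(grid[0]) else -1
                for j in range(col - r, col + r + 1)
            )
            for i in range(row - r, row + r + 1)
        )
        views.append(view)
    return views
```

Each crop is a hashable tuple, so it can be passed to `CyclePenalty.reward` as a state. Under these assumptions, smaller crops discard more of the surroundings, so a cycle detected on a small crop is shared across many full observations.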

Comments:	Published in Transactions on Machine Learning Research (08/2023)
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)
Cite as:	arXiv:2308.15911 [cs.LG]
	(or arXiv:2308.15911v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2308.15911

Submission history

From: Stefan Sylvius Wagner
[v1] Wed, 30 Aug 2023 09:38:44 UTC (12,659 KB)
