OptionZero: Planning with Learned Options

Huang, Po-Wei; Peng, Pei-Chiun; Guei, Hung; Wu, Ti-Rong

arXiv:2502.16634 (cs)

[Submitted on 23 Feb 2025 (v1), last revised 27 Feb 2025 (this version, v2)]

Abstract: Planning with options -- a sequence of primitive actions -- has been shown effective in reinforcement learning within complex environments. Previous studies have focused on planning with predefined options or learned options through expert demonstration data. Inspired by MuZero, which learns superhuman heuristics without any human knowledge, we propose a novel approach, named OptionZero. OptionZero incorporates an option network into MuZero, providing autonomous discovery of options through self-play games. Furthermore, we modify the dynamics network to provide environment transitions when using options, allowing searching deeper under the same simulation constraints. Empirical experiments conducted in 26 Atari games demonstrate that OptionZero outperforms MuZero, achieving a 131.58% improvement in mean human-normalized score. Our behavior analysis shows that OptionZero not only learns options but also acquires strategic skills tailored to different game characteristics. Our findings show promising directions for discovering and using options in planning. Our code is available at this https URL.
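To make the abstract's central idea concrete, the sketch below treats an option as a fixed sequence of primitive actions and applies it through a one-step model, so that a single planning step advances several environment steps. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_dynamics`, `apply_option`, the integer state, and the reward rule are all hypothetical, and the hand-written transition function stands in for the learned dynamics network described in the paper.

```python
from typing import Callable, Sequence, Tuple

State = int   # hypothetical: position on a number line
Action = int  # hypothetical: signed step size


def toy_dynamics(state: State, action: Action) -> Tuple[State, float]:
    """Hypothetical one-step model: move by `action`, reward 1.0 on reaching 3."""
    next_state = state + action
    reward = 1.0 if next_state == 3 else 0.0
    return next_state, reward


def apply_option(
    dynamics: Callable[[State, Action], Tuple[State, float]],
    state: State,
    option: Sequence[Action],
    discount: float = 1.0,
) -> Tuple[State, float]:
    """Apply every primitive action in `option` in order.

    Returns the final state and the discounted sum of rewards collected
    along the way, so the caller sees the option as one multi-step move.
    """
    total = 0.0
    scale = 1.0
    for action in option:
        state, reward = dynamics(state, action)
        total += scale * reward
        scale *= discount
    return state, total


# One call covers three environment steps.
final_state, option_return = apply_option(toy_dynamics, 0, (1, 1, 1))
print(final_state, option_return)  # prints: 3 1.0
```

In a tree search with a fixed simulation budget, expanding a node with a three-action option like this one reaches depth three in a single expansion, which is the sense in which options allow deeper search under the same constraint.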

Comments:	Accepted by the Thirteenth International Conference on Learning Representations (ICLR 2025) as an oral presentation
Subjects:	Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2502.16634 [cs.AI]
	(or arXiv:2502.16634v2 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2502.16634

Submission history

From: Po-Wei Huang
[v1] Sun, 23 Feb 2025 16:20:15 UTC (1,523 KB)
[v2] Thu, 27 Feb 2025 04:56:52 UTC (1,523 KB)
