Composing Reinforcement Learning Policies, with Formal Guarantees

Delgrange, Florent; Avni, Guy; Lukina, Anna; Schilling, Christian; Nowé, Ann; Pérez, Guillermo A.

arXiv:2402.13785 (cs)

[Submitted on 21 Feb 2024 (v1), last revised 10 Mar 2025 (this version, v2)]

Abstract: We propose a novel framework for controller design in environments with a two-level structure: a known high-level graph ("map") in which each vertex is populated by a Markov decision process, called a "room". The framework "separates concerns" by using different design techniques for low- and high-level tasks. We apply reactive synthesis for high-level tasks: given a specification as a logical formula over the high-level graph and a collection of low-level policies obtained together with "concise" latent structures, we construct a "planner" that selects which low-level policy to apply in each room. We develop a reinforcement learning procedure to train low-level policies on latent structures, which, unlike previous approaches, circumvents a model distillation step. We pair the policy with probably approximately correct guarantees on its performance and on the abstraction quality, and lift these guarantees to the high-level task. These formal guarantees are the main advantage of the framework. Other advantages include scalability (rooms are large and their dynamics are unknown) and reusability of low-level policies. We demonstrate feasibility in challenging case studies where an agent navigates environments with moving obstacles and visual inputs.
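
To make the two-level idea concrete, the following is a minimal illustrative sketch, not the authors' method. It assumes a much simpler setting than the paper: the high-level task is plain reachability of one target vertex, each map edge has one hypothetical low-level policy with an assumed success-probability estimate, and successes in different rooms are treated as independent so that they multiply along a path. The names `plan`, `success`, and the example map are all invented for illustration; the paper itself uses reactive synthesis over a logical specification and PAC guarantees, which this sketch does not reproduce.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]  # (room we are in, neighbouring room we try to reach)


def plan(success: Dict[Edge, float], target: str, sweeps: int = 100):
    """Pick, for each room, the exit that maximises the estimated
    probability of eventually reaching `target`.

    `success[(u, v)]` is an ASSUMED estimate of the probability that the
    low-level policy trained for room u exits towards room v.
    """
    vertices = {v for edge in success for v in edge} | {target}
    value = {v: 1.0 if v == target else 0.0 for v in vertices}
    choice: Dict[str, str] = {}
    for _ in range(sweeps):
        changed = False
        for (src, dst), p in success.items():
            if src == target:
                continue  # nothing left to do once the target is reached
            candidate = p * value[dst]
            if candidate > value[src] + 1e-12:
                value[src] = candidate
                choice[src] = dst
                changed = True
        if not changed:
            break  # values have stabilised
    return value, choice


# Hypothetical four-room map: two routes from A to D.
example_map = {
    ("A", "B"): 0.90, ("B", "D"): 0.80,
    ("A", "C"): 0.99, ("C", "D"): 0.70,
}
value, choice = plan(example_map, target="D")
print(choice["A"], round(value["A"], 3))
```

In this toy map the planner routes from A through B, because the product of the estimates along A, B, D exceeds that along A, C, D even though the first step towards C looks more reliable in isolation.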

Comments:	AAMAS 2025, 8 pages main text, 19 pages Appendix (excluding references)
Subjects:	Artificial Intelligence (cs.AI)
Cite as:	arXiv:2402.13785 [cs.AI]
	(or arXiv:2402.13785v2 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2402.13785

Submission history

From: Florent Delgrange
[v1] Wed, 21 Feb 2024 13:10:58 UTC (5,811 KB)
[v2] Mon, 10 Mar 2025 11:38:38 UTC (11,498 KB)
