Synthesising Reinforcement Learning Policies through Set-Valued Inductive Rule Learning

Coppens, Youri; Steckelmacher, Denis; Jonker, Catholijn M.; Nowé, Ann

doi:10.1007/978-3-030-73959-1_15

arXiv:2106.06009 (cs)

[Submitted on 10 Jun 2021]

Abstract: Today's advanced Reinforcement Learning algorithms produce black-box policies that are often difficult for a person to interpret and trust. We introduce a policy distillation algorithm, building on the CN2 rule mining algorithm, that distills the policy into a rule-based decision system. At the core of our approach is the fact that an RL process does not just learn a policy, a mapping from states to actions, but also produces extra meta-information, such as action values indicating the quality of alternative actions. This meta-information can indicate whether more than one action is near-optimal for a certain state. We extend CN2 so that it can leverage knowledge about equally good actions to distill the policy into fewer rules, making the result easier for a person to interpret. Then, to ensure that the rules explain a valid, non-degenerate policy, we introduce a refinement algorithm that fine-tunes the rules to obtain good performance when executed in the environment. We demonstrate the applicability of our algorithm on the Mario AI benchmark, a complex task that requires modern reinforcement learning algorithms including neural networks. The explanations we produce capture the learned policy in only a few rules that allow a person to understand what the black-box agent learned. Source code: this https URL
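The idea of set-valued labels can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `tolerance` threshold, and the toy action values are all assumptions made for illustration, and the abstract does not state which criterion the paper uses to decide that two actions are equally good. The sketch turns the action values of a state into a set of near-optimal actions, then measures how often a single predicted action agrees with such sets.

```python
from typing import Dict, FrozenSet, Hashable, Sequence


def near_optimal_actions(
    q_values: Dict[Hashable, float], tolerance: float = 0.05
) -> FrozenSet[Hashable]:
    """Return every action whose value is within `tolerance` of the best one.

    `tolerance` is a hypothetical threshold chosen for this sketch.
    """
    best = max(q_values.values())
    return frozenset(a for a, q in q_values.items() if best - q <= tolerance)


def agreement(action: Hashable, label_sets: Sequence[FrozenSet[Hashable]]) -> float:
    """Fraction of states for which `action` is among the acceptable actions.

    With set-valued labels, a rule predicting `action` counts as correct on a
    state whenever `action` belongs to that state's set, not only when it
    equals the single greedy action.
    """
    return sum(action in labels for labels in label_sets) / len(label_sets)


# Toy action values for three states (made-up numbers, not from the paper).
states = [
    {"left": 1.00, "right": 0.98, "jump": 0.20},
    {"left": 0.10, "right": 0.90, "jump": 0.30},
    {"left": 0.40, "right": 0.10, "jump": 0.95},
]
label_sets = [near_optimal_actions(q) for q in states]
print(label_sets[0])                    # the set containing 'left' and 'right'
print(agreement("right", label_sets))   # 2 of the 3 states accept 'right'
```

Under single-valued labels, the first state would be labelled `left` only, so a rule predicting `right` would count as wrong there. Accepting either action lets one rule cover both of the first two states, which is the mechanism the abstract credits for producing fewer rules.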

Comments:	17 pages, 4 figures. The final authenticated publication is available online at this https URL
Subjects:	Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2106.06009 [cs.AI]
	(or arXiv:2106.06009v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2106.06009
Journal reference:	Trustworthy AI - Integrating Learning, Optimization and Reasoning (2021), Lecture Notes in Computer Science, vol. 12641, pp. 163-179
Related DOI:	https://doi.org/10.1007/978-3-030-73959-1_15

Submission history

From: Youri Coppens
[v1] Thu, 10 Jun 2021 19:06:28 UTC (412 KB)
