Counterfactual Explanations as Plans

Belle, Vaishak

doi:10.4204/EPTCS.416.14

arXiv:2502.09205 (cs)

[Submitted on 13 Feb 2025]

Title: Counterfactual Explanations as Plans

Authors: Vaishak Belle (University of Edinburgh)

Abstract: There has been considerable recent interest in explainability in AI, especially for black-box machine learning models. As correctly observed by the planning community, when the application at hand is not a single-shot decision or prediction but a sequence of actions that depend on observations, a richer notion of explanation is desirable.
In this paper, we provide a formal account of "counterfactual explanations" in terms of action sequences. We then show that this naturally leads to an account of model reconciliation, which might take the form of the user correcting the agent's model or suggesting actions for the agent's plan. For this, we need to articulate what is true versus what is known, and we appeal to a modal fragment of the situation calculus to formalise these intuitions. We consider various settings (the agent knowing partial truths, knowing weakened truths, and having false beliefs) and show that our definitions easily generalize to these settings.

Comments:	In Proceedings ICLP 2024, arXiv:2502.08453
Subjects:	Artificial Intelligence (cs.AI); Logic in Computer Science (cs.LO)
Cite as:	arXiv:2502.09205 [cs.AI]
	(or arXiv:2502.09205v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2502.09205
Journal reference:	EPTCS 416, 2025, pp. 153-167
Related DOI:	https://doi.org/10.4204/EPTCS.416.14

Submission history

From: EPTCS [via EPTCS proxy]
[v1] Thu, 13 Feb 2025 11:45:54 UTC (78 KB)
