Declarative Approaches to Counterfactual Explanations for Classification

Bertossi, Leopoldo

arXiv:2011.07423 (cs)

[Submitted on 15 Nov 2020 (v1), last revised 7 Dec 2021 (this version, v3)]

Abstract: We propose answer-set programs that specify and compute counterfactual interventions on entities that are input to a classification model. In relation to the outcome of the model, the resulting counterfactual entities serve as a basis for the definition and computation of causality-based explanation scores for the feature values in the entity under classification, namely "responsibility scores". The approach and the programs can be applied with black-box models, and also with models that can be specified as logic programs, such as rule-based classifiers. The main focus of this work is on the specification and computation of "best" counterfactual entities, i.e. those that lead to maximum responsibility scores. From them one can read off the explanations as maximum-responsibility feature values in the original entity. We also extend the programs to bring semantic or domain knowledge into the picture. We show how the approach could be extended by means of probabilistic methods, and how the underlying probability distributions could be modified through the use of constraints. Several examples of programs written in the syntax of the DLV ASP-solver, and run with it, are shown.
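
As an illustration of what a responsibility score can look like, the following is a minimal brute-force sketch in Python. It is not one of the paper's answer-set programs and does not use DLV. It assumes a common causality-based definition that the abstract does not spell out: a feature value is an actual cause of the label if changing it, together with a contingency set of other feature values whose change alone leaves the label intact, flips the label; its score is 1/(1 + size of the smallest such contingency set). The classifier, the feature names, and the function `responsibility_scores` are hypothetical and chosen only for this example.

```python
from itertools import combinations, product


def responsibility_scores(classify, entity, domains):
    """Brute-force responsibility score for each feature value of `entity`.

    Assumption (not taken from the source): score = 1 / (1 + |contingency set|),
    maximized over all counterfactual interventions that flip the label.
    """
    original = classify(entity)
    features = list(entity)
    scores = {f: 0.0 for f in features}
    for size in range(1, len(features) + 1):
        for changed in combinations(features, size):
            # All alternative values for the features being intervened on.
            alternatives = [[v for v in domains[f] if v != entity[f]]
                            for f in changed]
            for values in product(*alternatives):
                candidate = {**entity, **dict(zip(changed, values))}
                if classify(candidate) == original:
                    continue  # this intervention does not flip the label
                for f in changed:
                    # Contingency check: with f restored to its original
                    # value, the remaining changes must not flip the label.
                    contingency = {**candidate, f: entity[f]}
                    if classify(contingency) == original:
                        scores[f] = max(scores[f], 1.0 / size)
    return scores


# Hypothetical rule-based classifier: 1 = loan rejected, 0 = loan approved.
def classify(e):
    return int(e["low_income"] and (e["high_debt"] or e["no_history"]))


entity = {"low_income": 1, "high_debt": 1, "no_history": 1}
domains = {name: [0, 1] for name in entity}

scores = responsibility_scores(classify, entity, domains)
print(scores)
```

In this toy setting, changing `low_income` alone flips the label, so it gets the maximum score, while `high_debt` and `no_history` each need the other to change as well. The enumeration is exponential in the number of features, so it only serves to make the definition concrete.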

Comments: Camera-ready of journal version, with some final additions and revisions. Revised and considerably extended version of a RuleML-RR'20 paper [arXiv:2004.13237]. Submitted by invitation
Subjects: Artificial Intelligence (cs.AI); Databases (cs.DB); Machine Learning (cs.LG); Logic in Computer Science (cs.LO)
Cite as: arXiv:2011.07423 [cs.AI] (or arXiv:2011.07423v3 [cs.AI] for this version)
DOI: https://doi.org/10.48550/arXiv.2011.07423

Submission history

From: Leopoldo Bertossi
[v1] Sun, 15 Nov 2020 00:44:33 UTC (36 KB)
[v2] Sat, 5 Jun 2021 01:33:29 UTC (142 KB)
[v3] Tue, 7 Dec 2021 23:57:07 UTC (143 KB)
