Preserving Causal Constraints in Counterfactual Explanations for Machine Learning Classifiers

Mahajan, Divyat; Tan, Chenhao; Sharma, Amit

arXiv:1912.03277 (cs)

[Submitted on 6 Dec 2019 (v1), last revised 12 Jun 2020 (this version, v3)]

Abstract: To construct interpretable explanations that are consistent with the original ML model, counterfactual examples (showing how the model's output changes with small perturbations to the input) have been proposed. This paper extends the work on counterfactual explanations by addressing the challenge of feasibility of such examples. For explanations of ML models in critical domains such as healthcare and finance, counterfactual examples are useful for an end-user only to the extent that perturbation of feature inputs is feasible in the real world. We formulate the problem of feasibility as preserving causal relationships among input features and present a method that uses (partial) structural causal models to generate actionable counterfactuals. When feasibility constraints cannot be easily expressed, we consider an alternative mechanism where people can label generated counterfactual (CF) examples on feasibility: whether it is feasible to intervene and realize the candidate CF example from the original input. To learn from this labelled feasibility data, we propose a modified variational autoencoder loss for generating CF examples that optimizes for feasibility as people interact with its output. Our experiments on Bayesian networks and the widely used "Adult-Income" dataset show that our proposed methods can generate counterfactual explanations that better satisfy feasibility constraints than existing methods. The code repository can be accessed here: this https URL
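To make the notion of feasibility as a causal constraint concrete, here is a minimal sketch of how a candidate counterfactual might be scored against hand-written constraints. Everything in it is an assumption for illustration and is not taken from the abstract: the `Person` record, the ordinal `education_level` encoding, the two example constraints (age cannot decrease; a gain in education requires a gain in age), and the hinge-style penalty. It is not the paper's loss function, only one plausible way such a constraint could be expressed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """Hypothetical two-feature input; names and encoding are assumptions."""
    age: float
    education_level: int  # assumed ordinal encoding, higher = more schooling


def feasibility_penalty(original: Person, candidate: Person) -> float:
    """Return 0.0 when both illustrative constraints hold, a positive value otherwise."""
    # Assumed unary constraint: age cannot decrease.
    # Penalize by the number of years the candidate moves backwards.
    age_decrease = max(0.0, original.age - candidate.age)

    # Assumed binary constraint: gaining education takes time,
    # so a higher education level requires a strictly higher age.
    education_gain = candidate.education_level > original.education_level
    missing_age_gain = 1.0 if education_gain and candidate.age <= original.age else 0.0

    return age_decrease + missing_age_gain


def is_feasible(original: Person, candidate: Person) -> bool:
    """A candidate counterfactual is feasible when it incurs no penalty."""
    return feasibility_penalty(original, candidate) == 0.0
```

For example, starting from `Person(age=30, education_level=2)`, the candidate `Person(age=32, education_level=3)` passes both checks, while `Person(age=30, education_level=3)` is flagged because education rises without any increase in age. A penalty of this kind could in principle be added to a generative model's training objective, but how the paper actually does so is not stated in the abstract.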

Comments:	2019 NeurIPS Workshop on Do the right thing: Machine learning and Causal Inference for improved decision making
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as:	arXiv:1912.03277 [cs.LG]
	(or arXiv:1912.03277v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1912.03277

Submission history

From: Divyat Mahajan
[v1] Fri, 6 Dec 2019 18:16:29 UTC (1,126 KB)
[v2] Sat, 22 Feb 2020 10:18:41 UTC (1,504 KB)
[v3] Fri, 12 Jun 2020 23:46:46 UTC (2,191 KB)
