Computer Science > Machine Learning
[Submitted on 6 Dec 2019 (v1), last revised 12 Jun 2020 (this version, v3)]
Title:Preserving Causal Constraints in Counterfactual Explanations for Machine Learning Classifiers
View PDFAbstract:To construct interpretable explanations that are consistent with the original ML model, counterfactual examples---showing how the model's output changes with small perturbations to the input---have been proposed. This paper extends the work in counterfactual explanations by addressing the challenge of feasibility of such examples. For explanations of ML models in critical domains such as healthcare and finance, counterfactual examples are useful for an end-user only to the extent that perturbation of feature inputs is feasible in the real world. We formulate the problem of feasibility as preserving causal relationships among input features and present a method that uses (partial) structural causal models to generate actionable counterfactuals. When feasibility constraints cannot be easily expressed, we consider an alternative mechanism where people can label generated CF examples on feasibility: whether it is feasible to intervene and realize the candidate CF example from the original input. To learn from this labelled feasibility data, we propose a modified variational auto encoder loss for generating CF examples that optimizes for feasibility as people interact with its output. Our experiments on Bayesian networks and the widely used ''Adult-Income'' dataset show that our proposed methods can generate counterfactual explanations that better satisfy feasibility constraints than existing methods.. Code repository can be accessed here: \textit{this https URL}
Submission history
From: Divyat Mahajan [view email][v1] Fri, 6 Dec 2019 18:16:29 UTC (1,126 KB)
[v2] Sat, 22 Feb 2020 10:18:41 UTC (1,504 KB)
[v3] Fri, 12 Jun 2020 23:46:46 UTC (2,191 KB)
Current browse context:
stat
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Connected Papers (What is Connected Papers?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
alphaXiv (What is alphaXiv?)
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Hugging Face (What is Huggingface?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
CORE Recommender (What is CORE?)
IArxiv Recommender
(What is IArxiv?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.