Interpretable Companions for Black-Box Models

Pan, Danqing; Wang, Tong; Hara, Satoshi

arXiv:2002.03494 (stat)

[Submitted on 10 Feb 2020 (v1), last revised 11 Feb 2020 (this version, v2)]

Abstract: We present an interpretable companion model for any pre-trained black-box classifier. The idea is that, for any input, a user can choose either to receive a prediction from the black-box model, with high accuracy but no explanation, or to employ a companion rule to obtain an interpretable prediction with slightly lower accuracy. The companion model is trained from data and the predictions of the black-box model, with an objective that combines the area under the transparency-accuracy curve and model complexity. Our model offers a flexible choice to practitioners who face the dilemma of choosing between always using interpretable models and always using black-box models for a predictive task: for any given input, users can fall back on an interpretable prediction if they find its predictive performance satisfactory, or stay with the black-box model if the rules are unsatisfactory. To show the value of companion models, we design a human evaluation with more than a hundred participants to investigate how much accuracy loss humans will tolerate in exchange for interpretability.
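As a rough illustration of the transparency-accuracy curve mentioned in the abstract, the sketch below takes one plausible reading: the companion is an ordered list of rules, the first k rules answer the inputs they cover, and the black-box model answers the rest. The abstract does not spell out this construction, so the function names (`transparency_accuracy_curve`, `area_under_curve`), the `rule_index`/`rule_pred` encoding, and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Optional, Sequence, Tuple


def transparency_accuracy_curve(
    y_true: Sequence[int],
    blackbox_pred: Sequence[int],
    rule_index: Sequence[Optional[int]],
    rule_pred: Sequence[Optional[int]],
    n_rules: int,
) -> List[Tuple[float, float]]:
    """Return (transparency, accuracy) points for k = 0..n_rules active rules.

    Hypothetical encoding: rule_index[i] is the position of the first rule
    covering input i (None if no rule covers it), and rule_pred[i] is that
    rule's prediction.
    """
    n = len(y_true)
    points = []
    for k in range(n_rules + 1):
        covered = 0
        correct = 0
        for i in range(n):
            # Input i is explained by a rule only if its rule is among the first k.
            if rule_index[i] is not None and rule_index[i] < k:
                covered += 1
                pred = rule_pred[i]
            else:
                pred = blackbox_pred[i]
            correct += int(pred == y_true[i])
        points.append((covered / n, correct / n))
    return points


def area_under_curve(points: List[Tuple[float, float]]) -> float:
    """Trapezoidal area under the piecewise-linear curve through the points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data (made up): six inputs, a two-rule companion, a black box with one error.
y_true = [1, 0, 1, 1, 0, 0]
blackbox_pred = [1, 0, 1, 1, 0, 1]
rule_index = [0, 0, 1, None, 1, None]
rule_pred = [1, 1, 1, None, 0, None]

points = transparency_accuracy_curve(y_true, blackbox_pred, rule_index, rule_pred, n_rules=2)
area = area_under_curve(points)
print(points, area)
```

Under this reading, a larger area means that more inputs can be handed to rules before accuracy drops. A training objective of the kind the abstract describes would trade that area off against a complexity penalty such as the number of rules.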

Comments:	15 pages, 6 figures
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:2002.03494 [stat.ML]
	(or arXiv:2002.03494v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2002.03494

Submission history

From: Pan Danqing
[v1] Mon, 10 Feb 2020 01:39:16 UTC (992 KB)
[v2] Tue, 11 Feb 2020 05:38:05 UTC (983 KB)
