To Trust Or Not To Trust A Classifier

Jiang, Heinrich; Kim, Been; Guan, Melody Y.; Gupta, Maya

Statistics > Machine Learning

arXiv:1805.11783 (stat)

[Submitted on 30 May 2018 (v1), last revised 26 Oct 2018 (this version, v2)]

Abstract: Knowing when a classifier's prediction can be trusted is useful in many applications and critical for safely using AI. While the bulk of the effort in machine learning research has been towards improving classifier performance, understanding when a classifier's predictions should and should not be trusted has received far less attention. The standard approach is to use the classifier's discriminant or confidence score; however, we show there exists an alternative that is more effective in many situations. We propose a new score, called the trust score, which measures the agreement between the classifier and a modified nearest-neighbor classifier on the testing example. We show empirically that high (low) trust scores produce surprisingly high precision at identifying correctly (incorrectly) classified examples, consistently outperforming the classifier's confidence score as well as many other baselines. Further, under some mild distributional assumptions, we show that if the trust score for an example is high (low), the classifier will likely agree (disagree) with the Bayes-optimal classifier. Our guarantees consist of non-asymptotic rates of statistical consistency under various nonparametric settings and build on recent developments in topological data analysis.
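
The abstract does not spell out how the trust score is computed, so the following is only a rough sketch of the idea, not the authors' implementation. It assumes the form given in the body of the paper: the distance from a test point to the nearest class other than the predicted one, divided by the distance to the predicted class. The paper's density-based filtering of the training set (the "modified" part of the nearest-neighbor classifier) is omitted here. The function name `trust_scores`, the use of Euclidean distance, the small constant in the denominator, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def trust_scores(train_x, train_y, test_x, predicted):
    """Simplified trust score (hypothetical helper, no density filtering).

    For each test point: distance to the nearest training point of any
    class other than the predicted one, divided by the distance to the
    nearest training point of the predicted class.
    """
    classes = np.unique(train_y)  # sorted unique labels
    # dists[i, c] = distance from test point i to its nearest
    # training point with label classes[c]
    dists = np.empty((len(test_x), len(classes)))
    for c, label in enumerate(classes):
        members = train_x[train_y == label]
        pairwise = np.linalg.norm(
            test_x[:, None, :] - members[None, :, :], axis=2
        )
        dists[:, c] = pairwise.min(axis=1)

    # Column index of each predicted label (labels must occur in train_y).
    pred_idx = np.searchsorted(classes, predicted)
    rows = np.arange(len(test_x))

    d_pred = dists[rows, pred_idx]
    others = dists.copy()
    others[rows, pred_idx] = np.inf  # exclude the predicted class
    d_other = others.min(axis=1)

    # Small constant (an assumption) avoids division by zero.
    return d_other / (d_pred + 1e-12)


# Toy data, invented for illustration: two well-separated classes.
train_x = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
train_y = np.array([0, 0, 1, 1])
test_x = np.array([[0.0, 0.5], [0.0, 0.5]])
predicted = np.array([0, 1])  # same point, two different predictions

scores = trust_scores(train_x, train_y, test_x, predicted)
print(scores)
```

In this toy setup the same test point is scored twice: once with the label of the nearby class and once with the label of the distant class. A score above 1 means the predicted class is the closer one, which is the sense in which the classifier "agrees" with a nearest-neighbor rule; a score below 1 flags a prediction the neighbors do not support.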

Comments:	NIPS 2018
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:1805.11783 [stat.ML]
	(or arXiv:1805.11783v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1805.11783

Submission history

From: Heinrich Jiang
[v1] Wed, 30 May 2018 02:48:58 UTC (5,433 KB)
[v2] Fri, 26 Oct 2018 20:32:21 UTC (3,431 KB)
