Efficient Nearest Neighbor based Uncertainty Estimation for Natural Language Processing Tasks

Hashimoto, Wataru; Kamigaito, Hidetaka; Watanabe, Taro

arXiv:2407.02138 (cs)

[Submitted on 2 Jul 2024]

Abstract: Trustworthy prediction in Deep Neural Networks (DNNs), including Pre-trained Language Models (PLMs), is important for safety-critical applications in the real world. However, DNNs often suffer from poor uncertainty estimation, such as miscalibration. In particular, approaches that require multiple stochastic inferences can mitigate this problem, but their expensive inference cost makes them impractical. In this study, we propose $k$-Nearest Neighbor Uncertainty Estimation ($k$NN-UE), an uncertainty estimation method that uses the distances from the neighbors and the label-existence ratio of the neighbors. Experiments on sentiment analysis, natural language inference, and named entity recognition show that our proposed method outperforms the baselines or recent density-based methods in confidence calibration, selective prediction, and out-of-distribution detection. Moreover, our analyses indicate that introducing dimension reduction or approximate nearest neighbor search, inspired by recent $k$NN-LM studies, reduces the inference overhead without significantly degrading estimation performance when they are combined appropriately.
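To make the two signals named in the abstract concrete, the following is a minimal sketch of how neighbor distances and a label-existence ratio could be used to adjust a classifier's confidence. It is an illustration under stated assumptions, not the paper's formulation: the function name `knn_adjusted_confidence`, the `alpha` parameter, the Euclidean metric, and the multiplicative `exp(-alpha * mean_dist) * label_ratio` scaling are all hypothetical choices made for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def knn_adjusted_confidence(logits, query, datastore, k=3, alpha=1.0):
    """Return (predicted_label, adjusted_confidence).

    datastore: list of (vector, label) pairs, e.g. training representations.
    The scaling rule below is an assumption for illustration only.
    """
    probs = softmax(logits)
    pred = max(range(len(probs)), key=probs.__getitem__)

    # Exact k-nearest-neighbor search by Euclidean distance.
    neighbors = sorted(
        ((math.dist(query, vec), label) for vec, label in datastore),
        key=lambda pair: pair[0],
    )[:k]

    # Signal 1: average distance from the query to its neighbors.
    mean_dist = sum(d for d, _ in neighbors) / len(neighbors)
    # Signal 2: fraction of neighbors that carry the predicted label.
    label_ratio = sum(1 for _, lab in neighbors if lab == pred) / len(neighbors)

    # Hypothetical combination: shrink confidence when neighbors are far
    # away or disagree with the predicted label.
    scale = math.exp(-alpha * mean_dist) * label_ratio
    return pred, probs[pred] * scale


if __name__ == "__main__":
    store = [([0.0, 0.0], 0), ([0.1, 0.0], 0), ([5.0, 5.0], 1), ([5.1, 5.0], 1)]
    print(knn_adjusted_confidence([2.0, 0.0], [0.0, 0.0], store, k=2))
    print(knn_adjusted_confidence([2.0, 0.0], [5.0, 5.0], store, k=2))
```

The exhaustive sort here costs time linear in the datastore size for every query, which is the kind of overhead the abstract says dimension reduction or approximate nearest neighbor search can reduce.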

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2407.02138 [cs.CL]
	(or arXiv:2407.02138v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2407.02138

Submission history

From: Wataru Hashimoto
[v1] Tue, 2 Jul 2024 10:33:31 UTC (177 KB)
