Towards Efficient and General-Purpose Few-Shot Misclassification Detection for Vision-Language Models

Zeng, Fanhu; Cheng, Zhen; Zhu, Fei; Zhang, Xu-Yao

arXiv:2503.20492 (cs)

[Submitted on 26 Mar 2025]

Abstract: Reliable prediction by classifiers is crucial for their deployment in high-security and dynamically changing situations. However, modern neural networks are often overconfident in their misclassified predictions, highlighting the need for confidence estimation to detect errors. Although existing methods perform well on small-scale datasets, they all require training from scratch, and no efficient and effective misclassification detection (MisD) method exists, which hinders practical application to large-scale and ever-changing datasets. In this paper, we exploit vision-language models (VLMs), leveraging text information, to establish an efficient and general-purpose misclassification detection framework. By harnessing the power of VLMs, we construct FSMisD, a Few-Shot prompt learning framework for MisD that avoids training from scratch and therefore improves tuning efficiency. To enhance misclassification detection ability, we use adaptive pseudo sample generation and a novel negative loss to mitigate overconfidence by pushing category prompts away from pseudo features. We conduct comprehensive experiments with prompt learning methods and validate the generalization ability across various datasets with domain shift. Significant and consistent improvements demonstrate the effectiveness, efficiency and generalizability of our approach.
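
To make the idea of confidence estimation for error detection concrete, here is a minimal sketch of a generic maximum-softmax-probability baseline on top of VLM-style features. It is not the FSMisD method; the abstract does not specify its scoring rule or loss. The function names, the temperature of 0.01, the threshold and the toy two-dimensional features are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_confidence(image_feature, prompt_features, temperature=0.01):
    """Return (predicted class index, maximum softmax probability).

    Assumption: logits are cosine similarities between the image feature
    and each category prompt feature, divided by a temperature.
    """
    logits = [cosine(image_feature, p) / temperature for p in prompt_features]
    probs = softmax(logits)
    predicted = max(range(len(probs)), key=probs.__getitem__)
    return predicted, probs[predicted]


def flag_misclassification(confidence, threshold=0.9):
    """Flag a prediction as a likely error when confidence is below threshold."""
    return confidence < threshold


# Toy category prompt features (hypothetical, two classes).
prompts = [[1.0, 0.0], [0.0, 1.0]]

clear_pred, clear_conf = predict_with_confidence([1.0, 0.1], prompts)
ambiguous_pred, ambiguous_conf = predict_with_confidence([1.0, 1.0], prompts)

print(clear_pred, flag_misclassification(clear_conf))
print(ambiguous_pred, flag_misclassification(ambiguous_conf))
```

A feature lying close to one prompt receives high confidence and is accepted, while a feature equidistant from both prompts receives low confidence and is flagged. An overconfident model breaks this baseline by assigning high confidence to wrong predictions, which is the failure mode the abstract sets out to mitigate.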

Comments:	preprint
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2503.20492 [cs.CV]
	(or arXiv:2503.20492v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2503.20492

Submission history

From: Fanhu Zeng
[v1] Wed, 26 Mar 2025 12:31:04 UTC (1,584 KB)
