Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection

Yang, Le; Zheng, Ziwei; Chen, Boxu; Zhao, Zhengyu; Lin, Chenhao; Shen, Chao

arXiv:2412.13817 (cs)

[Submitted on 18 Dec 2024]

Abstract: Recent studies have shown that large vision-language models (LVLMs) often suffer from object hallucinations (OH). To mitigate this issue, we introduce an efficient method that edits the model weights based on an unsafe subspace, which we call HalluSpace. With truthful and hallucinated text prompts accompanying the visual content as inputs, the HalluSpace is identified by extracting the hallucinated embedding features and removing the truthful representations in LVLMs. By orthogonalizing the model weights, input features are projected into the null space of the HalluSpace to reduce OH, which gives our method its name, Nullu. We reveal that HalluSpaces generally contain the statistical bias and unimodal priors of the large language models (LLMs) used to build LVLMs, which previous studies have shown to be essential causes of OH. Null space projection therefore suppresses the LLMs' priors and filters out the hallucinated features, resulting in contextually accurate outputs. Experiments show that our method effectively mitigates OH across different LVLM families without extra inference cost and also performs strongly on general LVLM benchmarks. Code is released at this https URL.
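
The projection idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes (the abstract does not say so) that the subspace is estimated from the top right singular vectors of the difference between paired hallucinated and truthful features, and that a weight matrix `W` acting as `y = W x` is edited by right-multiplying with the projector `I - V V^T`. The function names, the `rank` parameter, and the feature shapes are all hypothetical.

```python
import numpy as np


def estimate_subspace(halluc_feats, truth_feats, rank):
    """Estimate an orthonormal basis of shape (d, rank).

    Assumption: the subspace is spanned by the dominant directions of the
    difference between paired hallucinated and truthful features, each
    given as an (n, d) array.
    """
    diff = halluc_feats - truth_feats
    # Rows of vt are right singular vectors, i.e. directions in feature space,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(diff, full_matrices=False)
    return vt[:rank].T


def null_space_projector(basis):
    """Return P = I - V V^T, which removes components lying in span(V)."""
    d = basis.shape[0]
    return np.eye(d) - basis @ basis.T


def edit_weight(weight, basis):
    """Fold the projector into a (d_out, d_in) weight: W_new = W (I - V V^T).

    With y = W_new x, the input x is projected onto the null space of the
    estimated subspace before W is applied, so no extra work is needed at
    inference time.
    """
    return weight @ null_space_projector(basis)
```

Because the projector is folded into the weight once, the edited layer has the same shape and cost as the original, which is consistent with the abstract's claim of no extra inference cost.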

Comments:	Under review
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2412.13817 [cs.CV]
	(or arXiv:2412.13817v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2412.13817

Submission history

From: Ziwei Zheng
[v1] Wed, 18 Dec 2024 13:04:30 UTC (14,733 KB)
