HALLUCINOGEN: A Benchmark for Evaluating Object Hallucination in Large Visual-Language Models

Seth, Ashish; Manocha, Dinesh; Agarwal, Chirag


arXiv:2412.20622 (cs)

[Submitted on 29 Dec 2024]

Abstract: Large Vision-Language Models (LVLMs) have demonstrated remarkable performance on complex multimodal tasks. However, they are still plagued by object hallucination: the misidentification or misclassification of objects present in images. To this end, we propose HALLUCINOGEN, a novel visual question answering (VQA) object hallucination attack benchmark that uses diverse contextual reasoning prompts to evaluate object hallucination in state-of-the-art LVLMs. We design a series of contextual reasoning hallucination prompts to evaluate LVLMs' ability to accurately identify objects in a target image while asking them to perform diverse visual-language tasks such as identifying, locating, or performing visual reasoning around specific objects. Further, we extend our benchmark to high-stakes medical applications and introduce MED-HALLUCINOGEN, a set of hallucination attacks tailored to the biomedical domain, and evaluate the hallucination performance of LVLMs on medical images, an area where precision is critical. Finally, we conduct extensive evaluations of eight LVLMs and two hallucination mitigation strategies across multiple datasets to show that current generic and medical LVLMs remain susceptible to hallucination attacks.
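As a rough illustration of the evaluation idea in the abstract, the sketch below builds one probe per task family (identify, locate, reason) for a given object and scores how often a model's response treats an absent object as present. The template wordings, the `Probe` class, the function names, and the scoring rule are all assumptions made for this example; the abstract does not specify the benchmark's actual prompts or metric.

```python
from dataclasses import dataclass

# Hypothetical prompt templates, one per task family named in the abstract.
# These wordings are illustrative only, not the benchmark's actual prompts.
TEMPLATES = {
    "identify": "Is there a {obj} in the image?",
    "locate": "Where is the {obj} in the image?",
    "reason": "What is the {obj} in the image most likely being used for?",
}


@dataclass
class Probe:
    task: str      # which task family the prompt belongs to
    obj: str       # object the prompt asks about
    present: bool  # ground truth: is the object actually in the image?
    prompt: str    # text sent to the model alongside the image


def build_probes(obj: str, present: bool) -> list[Probe]:
    """Create one probe per task family for a single object."""
    return [
        Probe(task, obj, present, template.format(obj=obj))
        for task, template in TEMPLATES.items()
    ]


def hallucination_rate(probes: list[Probe], affirmed: list[bool]) -> float:
    """Fraction of absent-object probes whose response treated the object as present.

    `affirmed[i]` is True when the response to `probes[i]` accepted the
    object's existence (an assumed, externally supplied judgment).
    """
    absent = [a for p, a in zip(probes, affirmed) if not p.present]
    if not absent:
        return 0.0
    return sum(absent) / len(absent)


# Example: "dog" is absent from the image, "cat" is present.
probes = build_probes("dog", present=False) + build_probes("cat", present=True)
affirmed = [True, False, True, True, True, True]
rate = hallucination_rate(probes, affirmed)
```

In this sketch only the probes about the absent object count toward the rate, so the example yields two accepted probes out of three. How a response is judged to have "affirmed" an object is left open here and would need its own procedure in practice.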

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2412.20622 [cs.CV]
	(or arXiv:2412.20622v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2412.20622

Submission history

From: Ashish Seth
[v1] Sun, 29 Dec 2024 23:56:01 UTC (3,012 KB)
