Can't See the Forest for the Trees: Benchmarking Multimodal Safety Awareness for Multimodal LLMs

Wang, Wenxuan; Liu, Xiaoyuan; Gao, Kuiyi; Huang, Jen-tse; Yuan, Youliang; He, Pinjia; Wang, Shuai; Tu, Zhaopeng

Computer Science > Computation and Language

arXiv:2502.11184 (cs)

[Submitted on 16 Feb 2025]

Abstract: Multimodal Large Language Models (MLLMs) have expanded the capabilities of traditional language models by enabling interaction through both text and images. However, ensuring the safety of these models remains a significant challenge, particularly in accurately identifying whether multimodal content is safe or unsafe, a capability we term safety awareness. In this paper, we introduce MMSafeAware, the first comprehensive multimodal safety awareness benchmark, designed to evaluate MLLMs across 29 safety scenarios with 1500 carefully curated image-prompt pairs. MMSafeAware includes both unsafe and over-safety subsets to assess models' abilities to correctly identify unsafe content and to avoid the over-sensitivity that can hinder helpfulness. Evaluating nine widely used MLLMs on MMSafeAware reveals that current models are not sufficiently safe and are often overly sensitive; for example, GPT-4V misclassifies 36.1% of unsafe inputs as safe and 59.9% of benign inputs as unsafe. We further explore three methods to improve safety awareness (prompting-based approaches, visual contrastive decoding, and vision-centric reasoning fine-tuning) but find that none achieves satisfactory performance. Our findings highlight the profound challenges in developing MLLMs with robust safety awareness, underscoring the need for further research in this area. All code and data will be made publicly available to facilitate future research.
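The two error rates quoted in the abstract (unsafe inputs judged safe, and benign inputs judged unsafe) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the `Judgment` record, the subset labels "unsafe" and "over_safety", and the toy data are all assumptions made for illustration, and the actual MMSafeAware data format may differ.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Judgment:
    """One model verdict on one image-prompt pair (hypothetical record format)."""
    subset: str             # "unsafe" or "over_safety" (assumed label names)
    predicted_unsafe: bool  # True if the model flagged the pair as unsafe


def error_rates(judgments: Iterable[Judgment]) -> Dict[str, float]:
    """Return the two misclassification rates, one per subset."""
    items = list(judgments)
    unsafe = [j for j in items if j.subset == "unsafe"]
    benign = [j for j in items if j.subset == "over_safety"]

    # Unsafe pairs that the model let through as safe.
    missed = sum(1 for j in unsafe if not j.predicted_unsafe)
    # Benign pairs that the model wrongly flagged as unsafe.
    over_flagged = sum(1 for j in benign if j.predicted_unsafe)

    return {
        "unsafe_as_safe": missed / len(unsafe) if unsafe else float("nan"),
        "benign_as_unsafe": over_flagged / len(benign) if benign else float("nan"),
    }


# Toy data only: 4 unsafe pairs (1 missed) and 5 benign pairs (3 over-flagged).
toy = (
    [Judgment("unsafe", True)] * 3
    + [Judgment("unsafe", False)]
    + [Judgment("over_safety", True)] * 3
    + [Judgment("over_safety", False)] * 2
)
rates = error_rates(toy)
print(rates)
```

Reporting the two rates separately, rather than a single accuracy figure, keeps the trade-off visible: a model that flags everything as unsafe would score perfectly on the unsafe subset while failing the over-safety subset entirely.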

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
Cite as: arXiv:2502.11184 [cs.CL] (or arXiv:2502.11184v1 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2502.11184

Submission history

From: Wenxuan Wang
[v1] Sun, 16 Feb 2025 16:12:40 UTC (27,391 KB)
