SEA: Low-Resource Safety Alignment for Multimodal Large Language Models via Synthetic Embeddings

Lu, Weikai; Peng, Hao; Zhuang, Huiping; Chen, Cen; Zeng, Ziqian

arXiv:2502.12562 (cs)

[Submitted on 18 Feb 2025]

Abstract: Multimodal Large Language Models (MLLMs) have serious security vulnerabilities. While safety alignment using multimodal datasets consisting of text and data of additional modalities can effectively enhance the security of MLLMs, constructing these datasets is costly. Existing low-resource security alignment methods, including textual alignment, have been found to struggle with the security risks posed by additional modalities. To address this, we propose Synthetic Embedding augmented safety Alignment (SEA), which optimizes embeddings of the additional modality through gradient updates to expand textual datasets. This enables multimodal safety alignment training even when only textual data is available. Extensive experiments on image-, video-, and audio-based MLLMs demonstrate that SEA can synthesize a high-quality embedding on a single RTX3090 GPU within 24 seconds. SEA significantly improves the security of MLLMs when faced with threats from additional modalities. To assess the security risks introduced by video and audio, we also introduce a new benchmark called VA-SafetyBench. High attack success rates across multiple MLLMs confirm that the benchmark is challenging. Our code and data will be available at this https URL.
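
The idea of optimizing an embedding through gradient updates while the model stays frozen can be illustrated with a toy sketch. The abstract does not state SEA's objective or architecture, so everything below is an assumption made for illustration: the frozen "model" is a hypothetical logistic scorer, and the names `synthesize_embedding`, `weights`, `bias`, `steps`, and `lr` are invented here. It is not the authors' implementation.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def synthesize_embedding(weights, bias, steps=200, lr=0.5, seed=0):
    """Optimize an input embedding against a frozen logistic scorer.

    Assumption: `weights` and `bias` stand in for a frozen model. Only the
    embedding is updated, so that the scorer assigns high probability to a
    chosen target. The real SEA objective may differ.
    """
    rng = random.Random(seed)
    # Start from a small random embedding with the same dimension as weights.
    emb = [rng.uniform(-0.01, 0.01) for _ in weights]
    for _ in range(steps):
        z = sum(w * e for w, e in zip(weights, emb)) + bias
        p = sigmoid(z)
        # The gradient of -log(p) w.r.t. each coordinate is -(1 - p) * w,
        # so a gradient-descent step adds lr * (1 - p) * w.
        emb = [e + lr * (1.0 - p) * w for e, w in zip(emb, weights)]
    final_p = sigmoid(sum(w * e for w, e in zip(weights, emb)) + bias)
    return emb, final_p


# Hypothetical frozen parameters; they are never modified.
frozen_w = [0.5, -1.0, 2.0]
emb, prob = synthesize_embedding(frozen_w, bias=-3.0)
print(f"target probability after optimization: {prob:.3f}")
```

In this sketch the optimized vector plays the role of a synthetic stand-in for a non-text input: it is produced purely by gradient updates, without any real image, video, or audio data.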

Subjects:	Computation and Language (cs.CL); Cryptography and Security (cs.CR); Multimedia (cs.MM)
Cite as:	arXiv:2502.12562 [cs.CL]
	(or arXiv:2502.12562v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2502.12562

Submission history

From: Weikai Lu [view email]
[v1] Tue, 18 Feb 2025 05:57:35 UTC (845 KB)
