Revisiting Backdoor Attacks against Large Vision-Language Models

Liang, Siyuan; Liang, Jiawei; Pang, Tianyu; Du, Chao; Liu, Aishan; Chang, Ee-Chien; Cao, Xiaochun

arXiv:2406.18844v1 (cs)

[Submitted on 27 Jun 2024 (this version), latest version 2 Jul 2024 (v3)]

Abstract: Instruction tuning enhances large vision-language models (LVLMs) but raises security risks through potential backdoor attacks due to their openness. Previous backdoor studies focus on enclosed scenarios with consistent training and testing instructions, neglecting the practical domain gaps that could affect attack effectiveness. This paper empirically examines the generalizability of backdoor attacks during the instruction tuning of LVLMs for the first time, revealing certain limitations of most backdoor strategies in practical scenarios. We quantitatively evaluate the generalizability of six typical backdoor attacks on image caption benchmarks across multiple LVLMs, considering both visual and textual domain offsets. Our findings indicate that attack generalizability is positively correlated with the backdoor trigger's irrelevance to specific images/models and the preferential correlation of the trigger pattern. Additionally, we modify existing backdoor attacks based on the above key observations, demonstrating significant improvements in cross-domain scenario generalizability (+86% attack success rate). Notably, even without access to the instruction datasets, a multimodal instruction set can be successfully poisoned with a very low poisoning rate (0.2%), achieving an attack success rate of over 97%. This paper underscores that even simple traditional backdoor strategies pose a serious threat to LVLMs, necessitating more attention and in-depth research.
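
To make the two quantities quoted in the abstract concrete, the following is a minimal sketch of how a poisoning rate and an attack success rate are commonly computed. It is not taken from the paper: the function names, the dataset size of 100,000, the target string "banana", and the substring-match definition of success are all assumptions chosen for illustration, and the authors' exact evaluation protocol may differ.

```python
def poisoned_sample_count(dataset_size: int, poisoning_rate: float) -> int:
    """Number of samples altered at a given poisoning rate (rounded to nearest)."""
    return round(dataset_size * poisoning_rate)


def attack_success_rate(outputs: list[str], target: str) -> float:
    """Fraction of model outputs on triggered inputs that contain the target text.

    Assumption: success is a case-insensitive substring match, a common
    convention for caption tasks; the paper may define it differently.
    """
    if not outputs:
        return 0.0
    hits = sum(target.lower() in out.lower() for out in outputs)
    return hits / len(outputs)


# Hypothetical numbers: a 0.2% poisoning rate on 100,000 instruction samples.
n_poisoned = poisoned_sample_count(100_000, 0.002)

# Hypothetical captions produced for three triggered images.
captions = ["a banana on a table", "a dog in a park", "BANANA"]
asr = attack_success_rate(captions, "banana")
```

Under these assumed numbers, a 0.2% poisoning rate corresponds to 200 altered samples out of 100,000, which illustrates how small the poisoned subset can be relative to the full instruction set.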

Comments:	23 pages, 8 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2406.18844 [cs.CV]
	(or arXiv:2406.18844v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2406.18844

Submission history

From: Siyuan Liang
[v1] Thu, 27 Jun 2024 02:31:03 UTC (2,597 KB)
[v2] Fri, 28 Jun 2024 05:21:13 UTC (2,597 KB)
[v3] Tue, 2 Jul 2024 02:36:01 UTC (2,599 KB)
