A Survey of Attacks on Large Vision-Language Models: Resources, Advances, and Future Trends

Liu, Daizong; Yang, Mingyu; Qu, Xiaoye; Zhou, Pan; Cheng, Yu; Hu, Wei

arXiv:2407.07403 (cs)

[Submitted on 10 Jul 2024 (v1), last revised 12 Jul 2024 (this version, v2)]


Abstract: With the significant development of large models in recent years, Large Vision-Language Models (LVLMs) have demonstrated remarkable capabilities across a wide range of multimodal understanding and reasoning tasks. Compared to traditional Large Language Models (LLMs), LVLMs present both great potential and great challenges because they are closer to multi-resource real-world applications and must handle the complexity of multi-modal processing. However, the vulnerability of LVLMs remains relatively underexplored, posing potential security risks in daily usage. In this paper, we provide a comprehensive review of the various forms of existing LVLM attacks. Specifically, we first introduce the background of attacks targeting LVLMs, including attack preliminaries, attack challenges, and attack resources. Then, we systematically review the development of LVLM attack methods, such as adversarial attacks that manipulate model outputs, jailbreak attacks that exploit model vulnerabilities for unauthorized actions, prompt injection attacks that engineer the prompt type and pattern, and data poisoning that affects model training. Finally, we discuss promising future research directions. We believe that our survey provides insights into the current landscape of LVLM vulnerabilities, inspiring more researchers to explore and mitigate potential safety issues in LVLM development. The latest papers on LVLM attacks are continuously collected in this https URL.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2407.07403 [cs.CV]
	(or arXiv:2407.07403v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2407.07403

Submission history

From: Daizong Liu
[v1] Wed, 10 Jul 2024 06:57:58 UTC (462 KB)
[v2] Fri, 12 Jul 2024 03:58:05 UTC (462 KB)
