BADTV: Unveiling Backdoor Threats in Third-Party Task Vectors

Hsu, Chia-Yi; Tsai, Yu-Lin; Zhe, Yu; Chen, Yan-Lun; Lin, Chih-Hsun; Yu, Chia-Mu; Zhang, Yang; Huang, Chun-Ying; Sakuma, Jun

arXiv:2501.02373 (cs)

[Submitted on 4 Jan 2025]

Abstract: Task arithmetic in large-scale pre-trained models enables flexible adaptation to diverse downstream tasks without extensive re-training. By leveraging task vectors (TVs), users can perform modular updates to pre-trained models through simple arithmetic operations such as addition and subtraction. However, this flexibility introduces new security vulnerabilities. In this paper, we identify and evaluate the susceptibility of TVs to backdoor attacks, demonstrating how malicious actors can exploit TVs to compromise model integrity. By developing composite backdoors and eliminating redundant clean tasks, we introduce BadTV, a novel backdoor attack specifically designed to remain effective under task learning, forgetting, and analogy operations. Our extensive experiments reveal that BadTV achieves near-perfect attack success rates across various scenarios, significantly impacting the security of models that use task arithmetic. We also explore existing defenses, showing that current methods fail to detect or mitigate BadTV. Our findings highlight the need for robust defense mechanisms to secure TVs in real-world applications, especially as TV services become more popular in machine-learning ecosystems.
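
For readers unfamiliar with the three operations named in the abstract, the following is a minimal sketch of task arithmetic on toy parameter dictionaries. It assumes the common formulation in which a task vector is the element-wise difference between fine-tuned and pre-trained weights; the abstract itself does not spell this out, and every function and variable name below is hypothetical rather than taken from the paper or any library. It does not implement the BadTV attack; it only shows why a third-party TV matters for security: the vector is added directly to the user's weights.

```python
# Illustrative sketch of task arithmetic on toy "models" (dicts of parameter lists).
# Assumption: task vector = fine-tuned weights - pre-trained weights.
# All names are hypothetical; this is not the BadTV method.

def task_vector(pretrained, finetuned):
    """Element-wise difference finetuned - pretrained."""
    return {k: [f - p for f, p in zip(finetuned[k], pretrained[k])]
            for k in pretrained}

def apply_task_vectors(pretrained, vectors, scale=1.0):
    """Return pretrained + scale * sum(vectors), element-wise."""
    merged = {k: list(v) for k, v in pretrained.items()}
    for tv in vectors:
        for k in merged:
            merged[k] = [m + scale * t for m, t in zip(merged[k], tv[k])]
    return merged

def negate(tv):
    """Negated task vector, used for forgetting a task."""
    return {k: [-t for t in v] for k, v in tv.items()}

def analogy(tv_a, tv_b, tv_c):
    """Analogy vector tv_c + (tv_b - tv_a): 'A is to B as C is to D'."""
    return {k: [c + (b - a) for a, b, c in zip(tv_a[k], tv_b[k], tv_c[k])]
            for k in tv_c}

pretrained = {"w": [1.0, 2.0, 3.0]}
finetuned_a = {"w": [1.5, 2.0, 2.0]}   # toy weights after fine-tuning on task A
finetuned_b = {"w": [1.0, 3.0, 3.0]}   # toy weights after fine-tuning on task B

tv_a = task_vector(pretrained, finetuned_a)
tv_b = task_vector(pretrained, finetuned_b)

# Task learning: add both task vectors to the pre-trained weights.
learned = apply_task_vectors(pretrained, [tv_a, tv_b])

# Task forgetting: add the negated task vector.
forgotten = apply_task_vectors(pretrained, [negate(tv_a)])
```

Because the merged weights are a plain sum, whoever supplies `tv_a` or `tv_b` controls part of the final model, which is the trust boundary the abstract is concerned with.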

Subjects:	Machine Learning (cs.LG); Cryptography and Security (cs.CR)
Cite as:	arXiv:2501.02373 [cs.LG]
	(or arXiv:2501.02373v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2501.02373

Submission history

From: Chia-Yi Hsu
[v1] Sat, 4 Jan 2025 20:18:33 UTC (6,182 KB)
