Behind the Tip of Efficiency: Uncovering the Submerged Threats of Jailbreak Attacks in Small Language Models

Yi, Sibo; Cong, Tianshuo; He, Xinlei; Li, Qi; Song, Jiaxing

Computer Science > Cryptography and Security

arXiv:2502.19883 (cs)

[Submitted on 27 Feb 2025 (v1), last revised 28 Feb 2025 (this version, v2)]


Abstract: Small language models (SLMs) are increasingly prominent in edge-device deployments because of their high efficiency and low computational cost. While researchers continue to advance the capabilities of SLMs through innovative training strategies and model compression techniques, the security risks of SLMs have received considerably less attention than those of large language models (LLMs). To fill this gap, we provide a comprehensive empirical study that evaluates the security performance of 13 state-of-the-art SLMs under various jailbreak attacks. Our experiments demonstrate that most SLMs are quite susceptible to existing jailbreak attacks, while some of them are even vulnerable to direct harmful prompts. To address these safety concerns, we evaluate several representative defense methods and demonstrate their effectiveness in enhancing the security of SLMs. We further analyze the potential security degradation caused by different SLM techniques such as architecture compression, quantization, and knowledge distillation. We expect that our research can highlight the security challenges of SLMs and provide valuable insights for future work on developing more robust and secure SLMs.

Comments:	12 pages, 6 figures
Subjects:	Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2502.19883 [cs.CR]
	(or arXiv:2502.19883v2 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2502.19883

Submission history

From: Sibo Yi
[v1] Thu, 27 Feb 2025 08:44:04 UTC (598 KB)
[v2] Fri, 28 Feb 2025 12:59:26 UTC (598 KB)
