LongSafety: Evaluating Long-Context Safety of Large Language Models

Lu, Yida; Cheng, Jiale; Zhang, Zhexin; Cui, Shiyao; Wang, Cunxiang; Gu, Xiaotao; Dong, Yuxiao; Tang, Jie; Wang, Hongning; Huang, Minlie

Computer Science > Computation and Language

arXiv:2502.16971 (cs)

[Submitted on 24 Feb 2025]

Abstract: As Large Language Models (LLMs) continue to advance in understanding and generating long sequences, long contexts have introduced new safety concerns. However, the safety of LLMs in long-context tasks remains under-explored, leaving a significant gap in both the evaluation and the improvement of their safety. To address this, we introduce LongSafety, the first comprehensive benchmark specifically designed to evaluate LLM safety in open-ended long-context tasks. LongSafety encompasses 7 categories of safety issues and 6 user-oriented long-context tasks, with a total of 1,543 test cases averaging 5,424 words per context. Our evaluation of 16 representative LLMs reveals significant safety vulnerabilities, with most models achieving safety rates below 55%. Our findings also indicate that strong safety performance in short-context scenarios does not necessarily correlate with safety in long-context tasks, emphasizing the unique challenges and urgency of improving long-context safety. Moreover, through extensive analysis, we identify challenging safety issues and task types for long-context models. Furthermore, we find that relevant context and extended input sequences can exacerbate safety risks in long-context scenarios, highlighting the critical need for ongoing attention to long-context safety challenges. Our code and data are available at this https URL.

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as: arXiv:2502.16971 [cs.CL]
(or arXiv:2502.16971v1 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2502.16971

Submission history

From: Yida Lu
[v1] Mon, 24 Feb 2025 08:54:39 UTC (3,049 KB)
