CASE-Bench: Context-Aware SafEty Benchmark for Large Language Models

Sun, Guangzhi; Zhan, Xiao; Feng, Shutong; Woodland, Philip C.; Such, Jose

arXiv:2501.14940 (cs)

[Submitted on 24 Jan 2025 (v1), last revised 4 Feb 2025 (this version, v2)]

Abstract: Aligning large language models (LLMs) with human values is essential for their safe deployment and widespread adoption. Current LLM safety benchmarks often focus solely on whether models refuse individual problematic queries. This overlooks the context in which a query occurs and can lead to undesired refusals of queries in safe contexts, which diminishes the user experience. To address this gap, we introduce CASE-Bench, a Context-Aware SafEty Benchmark that integrates context into safety assessments of LLMs. CASE-Bench assigns distinct, formally described contexts to categorized queries based on Contextual Integrity theory. Additionally, in contrast to previous studies, which mainly rely on majority voting from just a few annotators, we used power analysis to recruit enough annotators to detect statistically significant differences among the experimental conditions. Our extensive analysis using CASE-Bench on various open-source and commercial LLMs reveals a substantial and significant influence of context on human judgments (p<0.0001 from a z-test), underscoring the necessity of context in safety evaluations. We also identify notable mismatches between human judgments and LLM responses, particularly in commercial models within safe contexts.
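
The abstract refers to two statistical steps: a power analysis to size the annotator pool and a z-test on the effect of context. The sketch below illustrates how such steps are commonly computed for a difference between two proportions (for example, the share of annotators judging a response acceptable under a safe versus an unsafe context). It is not the authors' procedure: the function names, the choice of a pooled two-proportion z-test, and the default `alpha` and `power` values are all assumptions made for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def annotators_per_condition(p1, p2, alpha=0.05, power=0.8):
    """Approximate per-condition sample size for a two-sided
    two-proportion z-test (hypothetical helper, normal approximation).

    p1, p2: assumed proportions under the two conditions (must differ).
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = _STD_NORMAL.inv_cdf(power)           # quantile for desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, two-sided p-value) using the pooled standard error.

    x1, x2: counts of "acceptable" judgments; n1, n2: annotators per condition.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - _STD_NORMAL.cdf(abs(z)))
    return z, p_value
```

Calling `annotators_per_condition(0.5, 0.7)` gives the number of annotators needed per condition to detect a shift from 50% to 70% under these assumptions, and `two_proportion_z_test(80, 100, 40, 100)` tests an illustrative (made-up) pair of counts. Smaller expected differences between conditions drive the required sample size up quickly, which is the usual motivation for running the power analysis before recruiting.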

Comments:	24 pages
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2501.14940 [cs.CL]
	(or arXiv:2501.14940v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2501.14940

Submission history

From: Xiao Zhan
[v1] Fri, 24 Jan 2025 21:55:14 UTC (5,423 KB)
[v2] Tue, 4 Feb 2025 20:40:32 UTC (6,226 KB)
