A Framework for Benchmarking and Aligning Task-Planning Safety in LLM-Based Embodied Agents

Huang, Yuting; Ding, Leilei; Tang, Zhipeng; Wang, Tianfu; Lin, Xinrui; Zhang, Wuyang; Ma, Mingxiao; Zhang, Yanyong

arXiv:2504.14650 (cs)

[Submitted on 20 Apr 2025]

Abstract: Large Language Models (LLMs) exhibit substantial promise in enhancing task-planning capabilities within embodied agents due to their advanced reasoning and comprehension. However, the systemic safety of these agents remains an underexplored frontier. In this study, we present Safe-BeAl, an integrated framework for the measurement (SafePlan-Bench) and alignment (Safe-Align) of LLM-based embodied agents' behaviors. SafePlan-Bench establishes a comprehensive benchmark for evaluating task-planning safety, encompassing 2,027 daily tasks and corresponding environments distributed across 8 distinct hazard categories (e.g., Fire Hazard). Our empirical analysis reveals that even in the absence of adversarial inputs or malicious intent, LLM-based agents can exhibit unsafe behaviors. To mitigate these hazards, we propose Safe-Align, a method designed to integrate physical-world safety knowledge into LLM-based embodied agents while maintaining task-specific performance. Experiments across a variety of settings demonstrate that Safe-BeAl provides comprehensive safety validation, improving safety by 8.55–15.22% compared to embodied agents based on GPT-4, while ensuring successful task completion.

Comments:	16 pages, 10 figures
Subjects:	Artificial Intelligence (cs.AI)
Cite as:	arXiv:2504.14650 [cs.AI]
	(or arXiv:2504.14650v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2504.14650

Submission history

From: Yuting Huang
[v1] Sun, 20 Apr 2025 15:12:14 UTC (20,700 KB)
