Safeguarding System Prompts for LLMs

Jiang, Zhifeng; Jin, Zhihua; He, Guoliang

Computer Science > Cryptography and Security

arXiv:2412.13426 (cs)

[Submitted on 18 Dec 2024 (v1), last revised 9 Jan 2025 (this version, v2)]

Title:Safeguarding System Prompts for LLMs

Authors:Zhifeng Jiang, Zhihua Jin, Guoliang He

View PDF HTML (experimental)

Abstract:Large language models (LLMs) are increasingly utilized in applications where system prompts, which guide model outputs, play a crucial role. These prompts often contain business logic and sensitive information, making their protection essential. However, adversarial and even regular user queries can exploit LLM vulnerabilities to expose these hidden prompts. To address this issue, we propose PromptKeeper, a robust defense mechanism designed to safeguard system prompts. PromptKeeper tackles two core challenges: reliably detecting prompt leakage and mitigating side-channel vulnerabilities when leakage occurs. By framing detection as a hypothesis-testing problem, PromptKeeper effectively identifies both explicit and subtle leakage. Upon detection, it regenerates responses using a dummy prompt, ensuring that outputs remain indistinguishable from typical interactions when no leakage is present. PromptKeeper ensures robust protection against prompt extraction attacks via either adversarial or regular queries, while preserving conversational capability and runtime efficiency during benign user interactions.

Comments:	15 pages, 5 figures, 2 tables
Subjects:	Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2412.13426 [cs.CR]
	(or arXiv:2412.13426v2 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2412.13426

Submission history

From: Zhifeng Jiang [view email]
[v1] Wed, 18 Dec 2024 01:43:25 UTC (1,079 KB)
[v2] Thu, 9 Jan 2025 14:33:25 UTC (863 KB)

Computer Science > Cryptography and Security

Title:Safeguarding System Prompts for LLMs

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Cryptography and Security

Title:Safeguarding System Prompts for LLMs

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators