TrustRAG: Enhancing Robustness and Trustworthiness in RAG

Zhou, Huichi; Lee, Kin-Hei; Zhan, Zhonghao; Chen, Yue; Li, Zhenhao

Abstract:Retrieval-Augmented Generation (RAG) systems enhance large language models (LLMs) by integrating external knowledge sources, enabling more accurate and contextually relevant responses tailored to user queries. However, these systems remain vulnerable to corpus poisoning attacks that can significantly degrade LLM performance through the injection of malicious content. To address these challenges, we propose TrustRAG, a robust framework that systematically filters compromised and irrelevant content before it reaches the language model. Our approach implements a two-stage defense mechanism: first, it employs K-means clustering to identify potential attack patterns in retrieved documents based on their semantic embeddings, effectively isolating suspicious content. Second, it leverages cosine similarity and ROUGE metrics to detect malicious documents while resolving discrepancies between the model's internal knowledge and external information through a self-assessment process. TrustRAG functions as a plug-and-play, training-free module that integrates seamlessly with any language model, whether open or closed-source, maintaining high contextual relevance while strengthening defenses against attacks. Through extensive experimental validation, we demonstrate that TrustRAG delivers substantial improvements in retrieval accuracy, efficiency, and attack resistance compared to existing approaches across multiple model architectures and datasets. We have made TrustRAG available as open-source software at \url{this https URL}.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2501.00879 [cs.CL]
	(or arXiv:2501.00879v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2501.00879

Computer Science > Computation and Language

Title:TrustRAG: Enhancing Robustness and Trustworthiness in RAG

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators