END: Early Noise Dropping for Efficient and Effective Context Denoising

Jin, Hongye; Chen, Pei; Yang, Jingfeng; Wang, Zhengyang; Jiang, Meng; Gao, Yifan; Huang, Binxuan; Zhang, Xinyang; Li, Zheng; Liu, Tianyi; Li, Huasheng; Yin, Bing

arXiv:2502.18915 (cs)

[Submitted on 26 Feb 2025]

Abstract: Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of natural language processing tasks. However, they are often distracted by irrelevant or noisy context in input sequences, which degrades output quality. This problem affects both long- and short-context scenarios, such as retrieval-augmented generation, table question answering, and in-context learning. We reveal that LLMs can implicitly identify whether input sequences contain useful information at early layers, prior to token generation. Leveraging this insight, we introduce Early Noise Dropping (END), a novel approach that mitigates this issue without requiring fine-tuning of the LLMs. END segments input sequences into chunks and employs a linear prober on the early layers of LLMs to differentiate between informative and noisy chunks. By discarding noisy chunks early in the process, END preserves critical information, reduces distraction, and lowers computational overhead. Extensive experiments demonstrate that END significantly improves both performance and efficiency across different LLMs on multiple evaluation datasets. Furthermore, by using the prober to investigate LLMs' implicit understanding of the input, this work also deepens understanding of how LLMs reason over context internally.
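The chunk-and-probe idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the chunk size, which early layer is probed, how hidden states are pooled, or the decision threshold. In the sketch below, the function names (`split_into_chunks`, `probe_score`, `drop_noisy_chunks`), the hand-written feature vectors standing in for early-layer hidden states, and the probe weights are all hypothetical placeholders chosen for illustration.

```python
import math
from typing import List, Sequence


def split_into_chunks(tokens: Sequence[str], chunk_size: int) -> List[List[str]]:
    """Split a token sequence into consecutive fixed-size chunks."""
    return [list(tokens[i:i + chunk_size]) for i in range(0, len(tokens), chunk_size)]


def probe_score(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Linear probe: sigmoid(w . x + b), read here as P(chunk is informative)."""
    logit = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def drop_noisy_chunks(
    chunks: List[List[str]],
    chunk_features: List[List[float]],
    weights: Sequence[float],
    bias: float,
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> List[List[str]]:
    """Keep only the chunks whose probe score reaches the threshold."""
    return [
        chunk
        for chunk, feats in zip(chunks, chunk_features)
        if probe_score(feats, weights, bias) >= threshold
    ]


# Toy input: six tokens split into three chunks of two tokens each.
tokens = "a b c d e f".split()
chunks = split_into_chunks(tokens, 2)

# Placeholder features: in a real system these would be derived from an
# early layer's hidden states for each chunk (an assumption of this sketch).
features = [[2.0, 0.0], [-2.0, 0.0], [1.0, 1.0]]

# Placeholder probe parameters: in practice these would be trained on
# labeled informative/noisy chunks.
weights, bias = [1.0, 0.5], 0.0

kept = drop_noisy_chunks(chunks, features, weights, bias)
print(kept)
```

With these toy values the second chunk receives a probe score below 0.5 and is dropped, so only the remaining chunks would be passed on to the later layers for generation.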

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2502.18915 [cs.CL]
	(or arXiv:2502.18915v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2502.18915

Submission history

From: Hongye Jin
[v1] Wed, 26 Feb 2025 08:07:17 UTC (1,033 KB)
