ICPC: In-context Prompt Compression with Faster Inference

Yu, Ziyang; Liu, Yuyu

arXiv:2501.01625 (cs)

[Submitted on 3 Jan 2025]

Authors: Ziyang Yu, Yuyu Liu

Abstract: Despite the recent success of Large Language Models (LLMs), feeding them long prompts remains challenging because the LLM input size is fixed. Prompt compression is a promising remedy: it removes redundant tokens from the prompt. However, existing works use an LLM for compression, which requires additional computational resources and incurs memory overhead. To address this, we propose ICPC (In-context Prompt Compression), a novel and scalable prompt compression method that adaptively reduces the prompt length. The key idea of ICPC is to use encoders to calculate the probability of each word appearing in the prompt, and then to calculate the information carried by each word through the information function. This reduces information loss during prompt compression and increases compression speed. Empirically, we demonstrate that ICPC effectively compresses long texts of different categories and thus achieves better performance and speed on different types of NLP tasks.
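The scoring idea in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not confirmed by the abstract: the "information function" is taken to be self-information, I(w) = -log2 p(w); the encoder is replaced by a toy relative-frequency estimate (`toy_probs`, a hypothetical stand-in); and the `compress` helper with its `keep_ratio` parameter is an illustrative choice, not the authors' procedure.

```python
import math
from collections import Counter


def self_information(p: float) -> float:
    """Information (in bits) carried by an event of probability p.

    Assumption: the abstract's "information function" is self-information.
    """
    return -math.log2(p)


def toy_probs(words):
    """Hypothetical stand-in for an encoder: relative word frequency.

    The paper uses encoders to estimate these probabilities; this
    frequency count only keeps the sketch dependency-free.
    """
    counts = Counter(w.lower() for w in words)
    total = len(words)
    return [counts[w.lower()] / total for w in words]


def compress(words, probs, keep_ratio=0.5):
    """Keep the highest-information words, preserving their original order."""
    if len(words) != len(probs):
        raise ValueError("words and probs must have the same length")
    n_keep = max(1, math.ceil(len(words) * keep_ratio))
    scores = [self_information(p) for p in probs]
    # Rank by descending information; ties go to the earlier word.
    top = sorted(range(len(words)), key=lambda i: (-scores[i], i))[:n_keep]
    return [words[i] for i in sorted(top)]


if __name__ == "__main__":
    prompt = "the cat sat on the mat near the door".split()
    print(" ".join(compress(prompt, toy_probs(prompt), keep_ratio=0.5)))
```

In this toy setting, frequent words such as "the" receive a high probability and therefore a low information score, so they are dropped first. An encoder-based estimate would instead condition each word's probability on its context.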

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2501.01625 [cs.CL]
	(or arXiv:2501.01625v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2501.01625

Submission history

From: Eric Yu
[v1] Fri, 3 Jan 2025 03:46:51 UTC (526 KB)
