Efficient Solutions For An Intriguing Failure of LLMs: Long Context Window Does Not Mean LLMs Can Analyze Long Sequences Flawlessly

Hosseini, Peyman; Castro, Ignacio; Ghinassi, Iacopo; Purver, Matthew

arXiv:2408.01866 (cs)

[Submitted on 3 Aug 2024]

Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities in comprehending and analyzing lengthy sequential inputs, owing to their extensive context windows that allow processing millions of tokens in a single forward pass. However, this paper uncovers a surprising limitation: LLMs fall short when handling long input sequences. We investigate this issue using three datasets and two tasks (sentiment analysis and news categorization) across various LLMs, including Claude 3, Gemini Pro, GPT 3.5 Turbo, Llama 3 Instruct, and Mistral Instruct models. To address this limitation, we propose and evaluate ad-hoc solutions that substantially enhance LLMs' performance on long input sequences by up to 50%, while reducing API cost and latency by up to 93% and 50%, respectively.

Comments:	11 pages, 5 figures, 6 tables
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
ACM classes:	I.2.7
Cite as:	arXiv:2408.01866 [cs.CL]
	(or arXiv:2408.01866v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2408.01866

Submission history

From: SeyedPeyman Hosseini
[v1] Sat, 3 Aug 2024 21:31:34 UTC (479 KB)
