The Reasoning-Memorization Interplay in Language Models Is Mediated by a Single Direction

Hong, Yihuai; Zhou, Dian; Cao, Meng; Yu, Lei; Jin, Zhijing

arXiv:2503.23084 (cs)

[Submitted on 29 Mar 2025]

Abstract: Large language models (LLMs) excel on a variety of reasoning benchmarks, but previous studies suggest they sometimes struggle to generalize to unseen questions, potentially due to over-reliance on memorized training examples. However, the precise conditions under which LLMs switch between reasoning and memorization during text generation remain unclear. In this work, we provide a mechanistic understanding of LLMs' reasoning-memorization dynamics by identifying a set of linear features in the model's residual stream that govern the balance between genuine reasoning and memory recall. These features not only distinguish reasoning tasks from memory-intensive ones but can also be manipulated to causally influence model performance on reasoning tasks. Additionally, we show that intervening in these reasoning features helps the model more accurately activate the most relevant problem-solving capabilities during answer generation. Our findings offer new insights into the underlying mechanisms of reasoning and memory in LLMs and pave the way for the development of more robust and interpretable generative AI systems.
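
As an illustration of what a linear feature in the residual stream can look like, the sketch below estimates a direction and then intervenes on it. The abstract does not say how the features are extracted or applied, so this is a minimal sketch under stated assumptions: the direction is taken to be the normalized difference of class means (a common estimator, not confirmed as the authors' method), the activations are synthetic vectors rather than real model states, and every function name is hypothetical.

```python
import random

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def unit_direction(reasoning_acts, memory_acts):
    """Normalized difference of class means (assumed estimator)."""
    mu_r = mean_vector(reasoning_acts)
    mu_m = mean_vector(memory_acts)
    diff = [a - b for a, b in zip(mu_r, mu_m)]
    norm = dot(diff, diff) ** 0.5
    return [x / norm for x in diff]

def steer(h, direction, alpha):
    """Shift a hidden state h by alpha along the direction."""
    return [x + alpha * d for x, d in zip(h, direction)]

def ablate(h, direction):
    """Remove the component of h along a unit-norm direction."""
    coeff = dot(h, direction)
    return [x - coeff * d for x, d in zip(h, direction)]

# Synthetic stand-ins for residual-stream activations (not real model data):
# the two classes differ only along the first coordinate, plus small noise.
rng = random.Random(0)
dim = 8
true_dir = [1.0] + [0.0] * (dim - 1)

def sample(shift):
    return [rng.gauss(0, 0.1) + shift * t for t in true_dir]

reasoning_acts = [sample(+1.0) for _ in range(200)]
memory_acts = [sample(-1.0) for _ in range(200)]

direction = unit_direction(reasoning_acts, memory_acts)
h = memory_acts[0]
h_steered = steer(h, direction, alpha=2.0)   # push toward the "reasoning" side
h_ablated = ablate(h, direction)             # project the feature out
```

In this toy setting, projecting any activation onto `direction` separates the two classes, `steer` moves that projection by exactly `alpha`, and `ablate` sets it to zero. With a real model, the same arithmetic would be applied to hidden states at a chosen layer and token position, which the abstract does not specify.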

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2503.23084 [cs.CL]
	(or arXiv:2503.23084v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2503.23084

Submission history

From: Yihuai Hong
[v1] Sat, 29 Mar 2025 14:00:44 UTC (6,325 KB)
