ThinkPrune: Pruning Long Chain-of-Thought of LLMs via Reinforcement Learning

Hou, Bairu; Zhang, Yang; Ji, Jiabao; Liu, Yujian; Qian, Kaizhi; Andreas, Jacob; Chang, Shiyu

arXiv:2504.01296 (cs)

[Submitted on 2 Apr 2025]

Abstract: We present ThinkPrune, a simple yet effective method for pruning the thinking length of long-thinking LLMs, which have been found to often produce inefficient and redundant thinking processes. Existing preliminary explorations of reducing thinking length primarily focus on forcing the thinking process to exit early, rather than adapting the LLM to optimize and consolidate the thinking process, and therefore the length-performance tradeoff observed so far is sub-optimal. To fill this gap, ThinkPrune offers a simple solution: it continues to train the long-thinking LLM via reinforcement learning (RL) with an added token limit, beyond which any unfinished thoughts and answers are discarded, resulting in a zero reward. To further preserve model performance, we introduce an iterative length pruning approach, in which multiple rounds of RL are conducted, each with a more stringent token limit than the last. We observe that ThinkPrune yields a remarkable performance-length tradeoff: on the AIME24 dataset, the reasoning length of DeepSeek-R1-Distill-Qwen-1.5B can be reduced by half with only a 2% drop in performance. We also observe that, after pruning, the LLMs can bypass unnecessary steps while keeping the core reasoning process complete. Code is available at this https URL.
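The two mechanisms named in the abstract, a token limit that zeroes the reward for unfinished responses and several RL rounds with progressively tighter limits, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the exact-match correctness check, the `ANSWER:` marker used by the toy extractor, and the `train_round` callback are all assumptions made for illustration, and the abstract does not specify the RL algorithm or the limit values.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

Policy = TypeVar("Policy")


def length_clipped_reward(
    tokens: Sequence[str],
    token_limit: int,
    extract_answer: Callable[[Sequence[str]], Optional[str]],
    reference: str,
) -> float:
    """Reward a response only if a correct answer appears within the limit.

    Everything past `token_limit` is discarded before the answer is
    extracted, so a response that has not finished by then earns zero.
    """
    kept = tokens[:token_limit]
    answer = extract_answer(kept)
    if answer is None:
        return 0.0  # no complete answer inside the limit
    # Exact match is an assumption; a real checker may be more lenient.
    return 1.0 if answer == reference else 0.0


def iterative_prune(
    policy: Policy,
    limits: List[int],
    train_round: Callable[[Policy, int], Policy],
) -> Policy:
    """Run one RL round per limit, each limit stricter than the previous.

    `train_round` is a hypothetical callback standing in for a full RL
    training run under the given token limit.
    """
    if any(nxt >= cur for cur, nxt in zip(limits, limits[1:])):
        raise ValueError("token limits must strictly decrease")
    for limit in limits:
        policy = train_round(policy, limit)
    return policy


def toy_extract(tokens: Sequence[str]) -> Optional[str]:
    """Toy extractor: return the token that follows an 'ANSWER:' marker."""
    for i, tok in enumerate(tokens[:-1]):
        if tok == "ANSWER:":
            return tokens[i + 1]
    return None
```

With the toy extractor, a seven-token response whose answer is its last token earns a reward of 1.0 under a limit of 7 or more and 0.0 under a limit of 6, because the truncation cuts off the answer. The limits passed to `iterative_prune` are placeholders to be chosen by the user.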

Comments:	15 pages, 7 figures
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2504.01296 [cs.CL]
	(or arXiv:2504.01296v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2504.01296

Submission history

From: Bairu Hou
[v1] Wed, 2 Apr 2025 01:59:26 UTC (659 KB)
