Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning

Xie, Tian; Gao, Zitian; Ren, Qingnan; Luo, Haoming; Hong, Yuqian; Dai, Bryan; Zhou, Joey; Qiu, Kai; Wu, Zhirong; Luo, Chong

arXiv:2502.14768 (cs)

[Submitted on 20 Feb 2025]

Abstract: Inspired by the success of DeepSeek-R1, we explore the potential of rule-based reinforcement learning (RL) in large reasoning models. To analyze reasoning dynamics, we use synthetic logic puzzles as training data because their complexity is controllable and their answers are straightforward to verify. Three technical contributions lead to effective and stable RL training: a system prompt that emphasizes the thinking and answering process, a stringent format reward function that penalizes outputs for taking shortcuts, and a straightforward training recipe that achieves stable convergence. Our 7B model develops advanced reasoning skills, such as reflection, verification, and summarization, that are absent from the logic corpus. Remarkably, after training on just 5K logic problems, it generalizes to the challenging math benchmarks AIME and AMC.
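
To make the idea of a rule-based reward with a strict format check more concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the tag layout, the reward values, or how answers are compared, so the `<think>`/`<answer>` tags, the numeric rewards, and the function names `format_reward` and `rule_based_reward` are all assumptions made for illustration.

```python
import re

# ASSUMPTION: the response must be one <think> block followed by one <answer> block.
# The abstract does not state the actual tag layout used in training.
LAYOUT = re.compile(
    r"\A\s*<think>(.+?)</think>\s*<answer>(.+?)</answer>\s*\Z", re.DOTALL
)
TAGS = ("<think>", "</think>", "<answer>", "</answer>")


def format_reward(response: str) -> float:
    """Return +1.0 if the response follows the assumed layout, else -1.0."""
    # Each tag must appear exactly once, so skipped or repeated blocks are penalized.
    if any(response.count(tag) != 1 for tag in TAGS):
        return -1.0
    match = LAYOUT.match(response)
    # An empty reasoning block counts as a shortcut in this sketch.
    if match is None or not match.group(1).strip():
        return -1.0
    return 1.0


def rule_based_reward(response: str, expected: str) -> float:
    """Combine the format check with an exact-match answer check."""
    fmt = format_reward(response)
    if fmt < 0:
        # Badly formatted outputs are never scored on their answer.
        return fmt
    answer = LAYOUT.match(response).group(2).strip()
    # ASSUMPTION: +2.0 / -1.5 are hypothetical values, not taken from the paper.
    return fmt + (2.0 if answer == expected.strip() else -1.5)
```

For example, under these assumptions a response of the form `<think>...</think><answer>A is truthful</answer>` checked against the expected answer `A is truthful` would receive the format reward plus the answer reward, while a response that omits the `<think>` block would receive only the format penalty. Gating the answer score on the format check is one way to stop a policy from collecting reward while skipping the reasoning step.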

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2502.14768 [cs.CL]
	(or arXiv:2502.14768v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2502.14768

Submission history

From: Tian Xie
[v1] Thu, 20 Feb 2025 17:49:26 UTC (10,935 KB)
