R1-T1: Fully Incentivizing Translation Capability in LLMs via Reasoning Learning

He, Minggui; Liu, Yilun; Tao, Shimin; Luo, Yuanchang; Zeng, Hongyong; Su, Chang; Zhang, Li; Ma, Hongxia; Wei, Daimeng; Meng, Weibin; Yang, Hao; Chen, Boxing; Yoshie, Osamu

Abstract: Despite recent breakthroughs in reasoning-enhanced large language models (LLMs) like DeepSeek-R1, incorporating inference-time reasoning into machine translation (MT), where human translators naturally employ structured, multi-layered chains of thought (CoTs), remains underexplored. Existing methods either design a fixed CoT tailored to a specific MT sub-task (e.g., literature translation), or rely on synthesized CoTs that are not aligned with human reasoning and on supervised fine-tuning (SFT), which is prone to catastrophic forgetting; both limit adaptability to diverse translation scenarios. This paper introduces R1-Translator (R1-T1), a novel framework that achieves inference-time reasoning for general MT via reinforcement learning (RL) with human-aligned CoTs comprising six common patterns. Our approach pioneers three innovations: (1) extending reasoning-based translation beyond MT sub-tasks to six languages and diverse tasks (e.g., legal/medical domain adaptation, idiom resolution); (2) formalizing six expert-curated CoT templates that mirror hybrid human strategies like context-aware paraphrasing and back translation; and (3) enabling self-evolving CoT discovery and anti-forgetting adaptation through RL with KL-constrained rewards. Experimental results indicate a steady improvement in translation performance across 21 languages and 80 translation directions on the Flores-101 test set, especially for the 15 languages unseen during training, while general multilingual abilities are preserved compared with plain SFT.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2502.19735 [cs.CL]
	(or arXiv:2502.19735v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2502.19735
