Two Intermediate Translations Are Better Than One: Fine-tuning LLMs for Document-level Translation Refinement

Dong, Yichen; Lyu, Xinglin; Li, Junhui; Wei, Daimeng; Zhang, Min; Tao, Shimin; Yang, Hao

arXiv:2504.05614 (cs)

[Submitted on 8 Apr 2025]

Abstract: Recent research has shown that large language models (LLMs) can enhance translation quality through self-refinement. In this paper, we build on this idea by extending the refinement from sentence-level to document-level translation, specifically focusing on document-to-document (Doc2Doc) translation refinement. Since sentence-to-sentence (Sent2Sent) and Doc2Doc translation address different aspects of the translation process, we propose fine-tuning LLMs for translation refinement using two intermediate translations, combining the strengths of both Sent2Sent and Doc2Doc. Additionally, recognizing that the quality of intermediate translations varies, we introduce an enhanced fine-tuning method with quality awareness that assigns lower weights to easier translations and higher weights to more difficult ones, enabling the model to focus on challenging translation cases. Experimental results across ten translation tasks with LLaMA-3-8B-Instruct and Mistral-Nemo-Instruct demonstrate the effectiveness of our approach.
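
The abstract does not state how the quality-aware weights are computed, so the following is only an illustrative sketch, not the authors' method. It assumes each training example comes with a quality score in [0, 1] for its intermediate translations (higher means the translation is already good, so the example is "easier"), and it uses a hypothetical mapping `1 - quality` with a small floor, normalized so the weights average to 1. The names `difficulty_weights`, `weighted_loss`, and `floor` are invented for this example.

```python
from typing import List, Sequence


def difficulty_weights(quality_scores: Sequence[float], floor: float = 0.1) -> List[float]:
    """Turn quality scores in [0, 1] into per-example loss weights.

    Assumption (not from the paper): weight is proportional to
    max(floor, 1 - quality), so harder examples get larger weights.
    Weights are rescaled so that their mean is 1.
    """
    if not quality_scores:
        raise ValueError("quality_scores must not be empty")
    raw = [max(floor, 1.0 - q) for q in quality_scores]
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]


def weighted_loss(per_example_losses: Sequence[float],
                  quality_scores: Sequence[float]) -> float:
    """Average the per-example losses using the difficulty weights."""
    if len(per_example_losses) != len(quality_scores):
        raise ValueError("losses and quality scores must have the same length")
    weights = difficulty_weights(quality_scores)
    total = sum(w * loss for w, loss in zip(weights, per_example_losses))
    return total / len(per_example_losses)


if __name__ == "__main__":
    # Two hypothetical examples: one easy (quality 0.9), one hard (quality 0.5).
    losses = [1.2, 2.0]
    qualities = [0.9, 0.5]
    print(difficulty_weights(qualities))
    print(weighted_loss(losses, qualities))
```

Because the weights are normalized to a mean of 1, the overall loss scale stays comparable to an unweighted average; only the relative emphasis between easy and hard examples changes.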

Comments:	Under Review
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2504.05614 [cs.CL]
	(or arXiv:2504.05614v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2504.05614

Submission history

From: Yichen Dong
[v1] Tue, 8 Apr 2025 02:08:07 UTC (906 KB)