Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging -- An Open Recipe

Pipatanakul, Kunat; Taveekitworachai, Pittawat; Manakul, Potsawee; Tharnpipitchai, Kasima

arXiv:2502.09056 (cs)

[Submitted on 13 Feb 2025 (v1), last revised 27 Mar 2025 (this version, v3)]

Abstract: This paper investigates data selection and model merging methodologies aimed at incorporating advanced reasoning capabilities such as those of DeepSeek R1 into language-specific large language models (LLMs), with a particular focus on the Thai LLM. Our goal is to enhance the reasoning capabilities of language-specific LLMs while maintaining their target language abilities. DeepSeek R1 excels in reasoning but primarily benefits high-resource languages such as English and Chinese. However, low-resource languages remain underserved due to the dominance of English-centric training data and model optimizations, which limit performance in these languages. This limitation results in unreliable code-switching and diminished effectiveness on tasks in low-resource languages. Meanwhile, local and regional LLM initiatives have attempted to bridge this gap by developing language-specific LLMs that focus on improving local linguistic fidelity. We demonstrate that, with only publicly available datasets and a computational budget of $120, it is possible to enhance the reasoning capabilities of language-specific LLMs to match the level of DeepSeek R1, without compromising their performance on target language tasks.

Comments:	9 pages
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2502.09056 [cs.CL]
	(or arXiv:2502.09056v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2502.09056

Submission history

From: Kunat Pipatanakul
[v1] Thu, 13 Feb 2025 08:10:45 UTC (994 KB)
[v2] Mon, 17 Feb 2025 13:16:00 UTC (994 KB)
[v3] Thu, 27 Mar 2025 06:45:16 UTC (994 KB)
