GreenMind: A Next-Generation Vietnamese Large Language Model for Structured and Logical Reasoning

Tung, Luu Quy; Viet, Hoang Quoc; Thu, Vo Trong

arXiv:2504.16832 (cs)

[Submitted on 23 Apr 2025]

Abstract: Chain-of-Thought (CoT) is a robust approach for LLM tasks that require intermediate reasoning steps before generating a final answer. In this paper, we present GreenMind-Medium-14B-R1, a Vietnamese reasoning model whose fine-tuning strategy is inspired by Group Relative Policy Optimization (GRPO). We also leverage a high-quality synthesized Vietnamese reasoning dataset and design two reward functions to address the main limitations of this technique: (i) language mixing, where we explicitly detect the presence of biased language characters while sampling tokens, and (ii) factual consistency, where we use Sentence Transformer-based models to ensure that the generated reasoning content remains factually correct and does not distort the final output. Experimental results on the Vietnamese dataset from the VLSP 2023 Challenge show that our model outperforms prior work and improves the linguistic consistency of its responses. We further extend the evaluation to SeaExam, a multilingual multiple-choice dataset, where our reasoning method proves effective compared with few-shot prompting techniques.
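To make the language-mixing reward and the group-relative scoring more concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes that "biased language characters" means letters from non-Latin scripts (for example CJK ideographs), that the reward is binary, and that rewards within a sampled group are normalized by the group mean and population standard deviation, as is common in descriptions of Group Relative Policy Optimization. The function names `is_foreign_script`, `language_reward`, and `group_relative_advantages` are hypothetical.

```python
import statistics
import unicodedata


def is_foreign_script(ch: str) -> bool:
    """Return True for a letter that is not in the Latin script.

    Assumption: Vietnamese text uses only Latin letters (including
    tone-marked ones such as 'ế' or 'đ'), so any other letter is
    treated as a sign of language mixing.
    """
    if not ch.isalpha():
        return False  # digits, punctuation, spaces, combining marks are neutral
    return not unicodedata.name(ch, "").startswith("LATIN")


def language_reward(text: str) -> float:
    """Binary reward (an assumption): 1.0 if no foreign-script letter appears, else 0.0."""
    return 0.0 if any(is_foreign_script(ch) for ch in text) else 1.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against its sampling group.

    Computes (r - mean) / (std + eps) over one group of completions
    generated for the same prompt. The population standard deviation
    and the eps guard are choices made for this sketch.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical group of sampled completions for a single prompt.
    completions = [
        "Hà Nội là thủ đô của Việt Nam.",
        "Hà Nội 是 thủ đô của Việt Nam.",
        "Thủ đô của Việt Nam là Hà Nội.",
        "Ответ: Hà Nội.",
    ]
    rewards = [language_reward(c) for c in completions]
    print(rewards)
    print(group_relative_advantages(rewards))
```

Completions that stay in Vietnamese receive a positive advantage relative to those that mix in other scripts. A script-based check like this cannot catch mixing with other Latin-script languages such as English, so a fuller detector would need an additional signal.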

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2504.16832 [cs.CL]
	(or arXiv:2504.16832v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2504.16832

Submission history

From: Thu Vo Trong
[v1] Wed, 23 Apr 2025 15:48:55 UTC (364 KB)
