Multi-Step Reasoning in Korean and the Emergent Mirage

Son, Guijin; Ko, Hyunwoo; Choi, Dasol

arXiv:2501.05712 (cs)

[Submitted on 10 Jan 2025]

Abstract: We introduce HRMCR (HAE-RAE Multi-Step Commonsense Reasoning), a benchmark designed to evaluate large language models' ability to perform multi-step reasoning in culturally specific contexts, focusing on Korean. The questions are automatically generated via templates and algorithms, requiring LLMs to integrate Korean cultural knowledge into sequential reasoning steps. Consistent with prior observations on emergent abilities, our experiments reveal that models trained with fewer than \(2 \cdot 10^{25}\) FLOPs struggle to solve any questions, showing near-zero performance. Beyond this threshold, performance improves sharply. State-of-the-art models (e.g., O1) still score under 50%, underscoring the difficulty of our tasks. Notably, stepwise analysis suggests the observed emergent behavior may stem from compounding errors across multiple steps rather than reflecting a genuinely new capability. We publicly release the benchmark and commit to regularly updating the dataset to prevent contamination.
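
The compounding-error explanation in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis: it assumes that every reasoning step succeeds independently with the same probability, and the step count of 6 and the function name `final_accuracy` are hypothetical choices made only for illustration. Under those assumptions, a question is answered correctly only if all steps are correct, so end-to-end accuracy is the per-step accuracy raised to the number of steps.

```python
def final_accuracy(step_accuracy: float, num_steps: int) -> float:
    """Probability that every step of a chain is correct.

    Assumes (for illustration only) that steps succeed independently
    and with the same probability; real benchmark steps may not.
    """
    if not 0.0 <= step_accuracy <= 1.0:
        raise ValueError("step_accuracy must be in [0, 1]")
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    return step_accuracy ** num_steps


if __name__ == "__main__":
    NUM_STEPS = 6  # hypothetical chain length, not taken from the paper
    for p in (0.5, 0.7, 0.9, 0.95, 0.99):
        acc = final_accuracy(p, NUM_STEPS)
        print(f"per-step accuracy {p:.2f} -> end-to-end accuracy {acc:.3f}")
```

With these assumed numbers, per-step accuracy that improves steadily still yields end-to-end accuracy that stays near zero for a long time and then rises quickly, which can look like a sudden new capability when only the final answer is scored.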

Comments:	11 pages, 7 figures
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2501.05712 [cs.CL]
	(or arXiv:2501.05712v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2501.05712

Submission history

From: Hyunwoo Ko
[v1] Fri, 10 Jan 2025 05:07:27 UTC (2,593 KB)
