Exploring Translation Mechanism of Large Language Models

Zhang, Hongbin; Chen, Kehai; Bai, Xuefeng; Li, Xiucheng; Zhang, Min

arXiv:2502.11806 (cs)

[Submitted on 17 Feb 2025]

Abstract: Large language models (LLMs) have achieved remarkable success in multilingual translation tasks. However, their internal translation mechanisms remain poorly understood, largely because of their complex architectures and vast parameter counts. To address this, this study explores the translation mechanism of LLMs from the perspective of computational components (e.g., attention heads and MLPs). Path patching is used to trace causal relationships between components, identifying those crucial for translation and then analyzing their behavioral patterns in human-interpretable terms. The analysis reveals that translation is predominantly driven by a sparse subset of specialized attention heads (less than 5%), which extract source language, indicator, and positional features. MLPs subsequently integrate and process these features by transitioning toward English-centric latent representations. Building on these findings, targeted fine-tuning of only 64 heads achieves translation improvement comparable to full-parameter tuning while preserving general capabilities.
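
The causal-intervention idea behind the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it uses plain activation patching (a simpler relative of path patching) on a hypothetical one-layer additive model built with NumPy. All names here (`head_outputs`, `readout_logit`, `patching_effects`, the weight matrices) are assumptions made for illustration. Each toy "head" writes into a shared residual vector, and one head at a time has its output swapped for the value it produces on a corrupted input, so that the change in the output logit can be attributed to that head.

```python
import numpy as np

def head_outputs(x, W_heads):
    """Return per-head outputs, shape (n_heads, d_model).

    Each toy 'head' is just a linear map of the input (an assumption;
    real attention heads are far more complex).
    """
    return np.stack([W @ x for W in W_heads])

def readout_logit(per_head, w_out):
    """Sum head outputs into a residual vector and read out one logit."""
    return float(w_out @ per_head.sum(axis=0))

def patching_effects(x_clean, x_corrupt, W_heads, w_out):
    """Logit change caused by patching each head, one at a time.

    For head h, the clean-run output is replaced by the output the same
    head produces on the corrupted input; all other heads stay clean.
    """
    clean = head_outputs(x_clean, W_heads)
    corrupt = head_outputs(x_corrupt, W_heads)
    base = readout_logit(clean, w_out)
    effects = []
    for h in range(len(W_heads)):
        patched = clean.copy()
        patched[h] = corrupt[h]  # intervene on a single head only
        effects.append(readout_logit(patched, w_out) - base)
    return np.array(effects)

if __name__ == "__main__":
    # Hypothetical setup: only head 0 actually reads the input.
    W_heads = [np.eye(2), np.zeros((2, 2)), np.zeros((2, 2))]
    w_out = np.array([1.0, 1.0])
    x_clean = np.array([1.0, 0.0])    # stands in for a translation prompt
    x_corrupt = np.array([0.0, 0.0])  # stands in for a corrupted prompt
    print(patching_effects(x_clean, x_corrupt, W_heads, w_out))
```

Heads whose patching produces a large change in the logit are the candidates for being "crucial" in the sense the abstract describes. In a real transformer, the same loop would run over layers and heads using cached activations, and path patching would additionally restrict the intervention to specific sender-to-receiver paths.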

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2502.11806 [cs.CL]
	(or arXiv:2502.11806v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2502.11806

Submission history

From: Hongbin Zhang
[v1] Mon, 17 Feb 2025 13:50:29 UTC (2,832 KB)
