Disentangling Reasoning Tokens and Boilerplate Tokens For Language Model Fine-tuning

Ye, Ziang; Zhang, Zhenru; Zhang, Yang; Ma, Jianxin; Lin, Junyang; Feng, Fuli


arXiv:2412.14780 (cs)

[Submitted on 19 Dec 2024]

Abstract: When using agent-task datasets to enhance agent capabilities for Large Language Models (LLMs), current methodologies often treat all tokens within a sample equally. However, we argue that tokens serving different roles - specifically, reasoning tokens versus boilerplate tokens (e.g., those governing output format) - differ significantly in importance and learning complexity, necessitating their disentanglement and distinct treatment. To address this, we propose a novel Shuffle-Aware Discriminator (SHAD) for adaptive token discrimination. SHAD classifies tokens by exploiting predictability differences observed after shuffling input-output combinations across samples: boilerplate tokens, due to their repetitive nature among samples, maintain predictability, whereas reasoning tokens do not. Using SHAD, we propose the Reasoning-highlighted Fine-Tuning (RFT) method, which adaptively emphasizes reasoning tokens during fine-tuning, yielding notable performance gains over common Supervised Fine-Tuning (SFT).
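
The two ideas in the abstract (separating tokens by how predictable they remain after shuffling, then weighting reasoning tokens more heavily in the loss) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the `margin` threshold, the decision rule that compares per-token losses from a base model against a model tuned on shuffled input-output pairs, and the fixed `reasoning_weight`. The abstract states only that the emphasis is adaptive and does not specify the weighting scheme.

```python
from typing import List


def classify_tokens(loss_base: List[float],
                    loss_shuffled: List[float],
                    margin: float = 0.0) -> List[bool]:
    """Return True for tokens treated as reasoning, False for boilerplate.

    Hypothetical rule: a token whose loss drops by more than `margin` under
    a model tuned on shuffled input-output pairs stays predictable without
    its true input, so it is treated as boilerplate.
    """
    if len(loss_base) != len(loss_shuffled):
        raise ValueError("loss lists must have the same length")
    return [ls >= lb - margin for lb, ls in zip(loss_base, loss_shuffled)]


def reasoning_weighted_loss(token_losses: List[float],
                            is_reasoning: List[bool],
                            reasoning_weight: float = 2.0) -> float:
    """Weighted mean of per-token losses, up-weighting reasoning tokens.

    A fixed weight is a simplification chosen for this sketch.
    """
    if len(token_losses) != len(is_reasoning):
        raise ValueError("inputs must have the same length")
    if not token_losses:
        raise ValueError("need at least one token")
    weights = [reasoning_weight if r else 1.0 for r in is_reasoning]
    total = sum(w * l for w, l in zip(weights, token_losses))
    return total / sum(weights)


# Toy per-token losses (made-up numbers, not results from the paper).
mask = classify_tokens([2.0, 1.5, 3.0], [0.5, 1.6, 3.2])
loss = reasoning_weighted_loss([1.0, 2.0, 3.0], mask)
```

With `reasoning_weight=1.0` the second function reduces to the plain per-token mean, which corresponds to treating all tokens equally as in standard SFT.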

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2412.14780 [cs.CL]
	(or arXiv:2412.14780v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2412.14780

Submission history

From: Ziang Ye
[v1] Thu, 19 Dec 2024 12:06:24 UTC (1,827 KB)
