Boosting Large Language Models with Mask Fine-Tuning

Zhang, Mingyuan; Bai, Yue; Wang, Huan; Wang, Yizhou; Dong, Qihua; Fu, Yun

arXiv:2503.22764 (cs)

[Submitted on 27 Mar 2025]

Abstract: Mainstream large language model (LLM) fine-tuning protocols usually keep the model intact. No prior work has questioned whether maintaining the model's integrity is indispensable for performance. In this work, we introduce Mask Fine-Tuning (MFT), a new LLM fine-tuning paradigm, to show that properly breaking the integrity of the model can, surprisingly, improve performance. Specifically, MFT learns a set of binary masks supervised by the typical LLM fine-tuning objective. Extensive experiments show that MFT yields a consistent performance boost across various domains and backbones (e.g., average gains of 1.95% and 1.88% in coding with LLaMA2-7B and LLaMA3.1-8B, respectively). We provide detailed procedures that study MFT from different hyperparameter perspectives for better insight. In particular, MFT naturally updates the current LLM training protocol, because it is deployed on a complete, well-trained model. This study extends mask learning from its conventional use in network pruning for model compression to a more general scope.
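
To make the idea of learning binary masks over a frozen model concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation: the abstract does not say how the masks are parameterized or optimized, so the real-valued scores, the zero threshold, the straight-through gradient, the squared-error loss, and every name below (`binarize`, `learn_mask`, `frozen_weights`, and so on) are assumptions chosen for illustration. A four-weight linear model stands in for an LLM, and the data are built so that one frozen weight is harmful for the fine-tuning task.

```python
# Toy sketch of mask learning over frozen weights (illustrative assumptions only).

def forward(weights, mask, x):
    """Linear model whose weights are gated element-wise by a binary mask."""
    return sum(w * m * xi for w, m, xi in zip(weights, mask, x))

def binarize(scores):
    """Assumed rule: keep a weight (mask = 1) while its score is positive."""
    return [1.0 if s > 0.0 else 0.0 for s in scores]

def mse(weights, mask, inputs, targets):
    errs = [forward(weights, mask, x) - t for x, t in zip(inputs, targets)]
    return sum(e * e for e in errs) / len(errs)

def learn_mask(weights, inputs, targets, steps=10, lr=0.5):
    """Update only the mask scores; the weights themselves never change."""
    scores = [1.0] * len(weights)  # start from the complete model (all kept)
    n = len(inputs)
    for _ in range(steps):
        mask = binarize(scores)
        grads = [0.0] * len(weights)
        for x, t in zip(inputs, targets):
            err = forward(weights, mask, x) - t
            for i, (w, xi) in enumerate(zip(weights, x)):
                # Straight-through estimator (an assumption): the gradient
                # with respect to the binary mask is applied to the score.
                grads[i] += 2.0 * err * w * xi / n
        scores = [s - lr * g for s, g in zip(scores, grads)]
    return binarize(scores)

# Hypothetical "well-trained" weights; the third one hurts on the new task.
frozen_weights = [1.0, -2.0, 0.5, 3.0]
task_weights = [1.0, -2.0, 0.0, 3.0]  # used only to generate toy targets
inputs = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
targets = [sum(w * xi for w, xi in zip(task_weights, x)) for x in inputs]

full_mask = [1.0] * len(frozen_weights)
loss_before = mse(frozen_weights, full_mask, inputs, targets)
mask = learn_mask(frozen_weights, inputs, targets)
loss_after = mse(frozen_weights, mask, inputs, targets)

print(mask)                     # [1.0, 1.0, 0.0, 1.0]
print(loss_before, loss_after)  # 0.25 0.0
```

In this toy setting the learned mask drops only the harmful weight and the loss falls from 0.25 to 0.0 without any weight update. The inputs are deliberately orthogonal so that the straight-through update is well behaved; nothing here says how MFT behaves at LLM scale.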

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2503.22764 [cs.CL]
	(or arXiv:2503.22764v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2503.22764

Submission history

From: Yue Bai
[v1] Thu, 27 Mar 2025 20:17:57 UTC (3,119 KB)
