Upweighting Easy Samples in Fine-Tuning Mitigates Forgetting

Sanyal, Sunny; Prairie, Hayden; Das, Rudrajit; Kavis, Ali; Sanghavi, Sujay

arXiv:2502.02797 (cs)

[Submitted on 5 Feb 2025]

Abstract: Fine-tuning a pre-trained model on a downstream task often degrades its original capabilities, a phenomenon known as "catastrophic forgetting". This is especially problematic when one does not have access to the data and recipe used to develop the pre-trained model. Under this constraint, most existing methods for mitigating forgetting are inapplicable. To address this challenge, we propose a sample weighting scheme for the fine-tuning data based solely on the pre-trained model's losses. Specifically, we upweight the easy samples, on which the pre-trained model's loss is low, and downweight the hard samples, on which it is high, to limit the drift from the pre-trained model. Our approach is orthogonal yet complementary to existing methods: while such methods mostly operate in parameter or gradient space, we concentrate on the sample space. We theoretically analyze the impact of fine-tuning with our method in a linear setting, showing that it stalls learning in a certain subspace, which inhibits overfitting to the target task. We empirically demonstrate the efficacy of our method on both language and vision tasks. As an example, when fine-tuning Gemma 2 2B on MetaMathQA, our method results in only a $0.8\%$ drop in accuracy on GSM8K (another math dataset) compared to standard fine-tuning, while preserving $5.4\%$ more accuracy on the pre-training datasets. Our code is publicly available at this https URL.
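
As a rough illustration of the weighting idea described in the abstract, the sketch below assigns each fine-tuning sample a weight that decreases with the pre-trained model's loss on it. The abstract does not specify the functional form, so the exponential map, the median-based temperature, and the names `easy_sample_weights` and `weighted_objective` are assumptions made for this example only, not necessarily the authors' implementation.

```python
import math
from statistics import median


def easy_sample_weights(pretrained_losses, temperature=None):
    """Map per-sample pre-trained losses to weights that shrink as loss grows.

    Assumption: an exponential form exp(-loss / temperature). The abstract
    only says that low-loss (easy) samples are upweighted and vice versa.
    """
    if temperature is None:
        # Assumed default: use the median loss as the temperature scale.
        temperature = median(pretrained_losses)
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    return [math.exp(-loss / temperature) for loss in pretrained_losses]


def weighted_objective(current_losses, weights):
    """Weighted sum of the current per-sample fine-tuning losses."""
    return sum(w * loss for w, loss in zip(weights, current_losses))


# Hypothetical per-sample losses of the frozen pre-trained model.
pretrained_losses = [0.1, 0.5, 2.0, 4.0]

# Weights are computed once, before fine-tuning, and then held fixed.
weights = easy_sample_weights(pretrained_losses)

# During fine-tuning, the per-sample losses of the current model would be
# combined using these fixed weights instead of a uniform average.
objective = weighted_objective([0.2, 0.4, 1.5, 3.0], weights)
```

Because the weights depend only on losses of the pre-trained model, a scheme of this kind needs neither the pre-training data nor the pre-training recipe, which matches the constraint stated in the abstract.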

Comments:	49 pages, 4 figures, 12 tables. Code available at this https URL
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as:	arXiv:2502.02797 [cs.LG]
	(or arXiv:2502.02797v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2502.02797

Submission history

From: Ali Kavis
[v1] Wed, 5 Feb 2025 00:49:59 UTC (798 KB)
