Non-Uniform Parameter-Wise Model Merging

Camacho, Albert Manuel Orozco; Horoi, Stefan; Wolf, Guy; Belilovsky, Eugene

arXiv:2412.15467 (cs)

[Submitted on 20 Dec 2024]

Abstract: Combining multiple machine learning models has long been a technique for enhancing performance, particularly in distributed settings. Traditional approaches, such as model ensembles, work well but are expensive in memory and compute. Recently, methods based on averaging model parameters have achieved good results in some settings and have gained popularity. However, merging models that were initialized differently and do not share part of their training trajectories can yield worse results than simply using the base models, even after their neurons have been aligned. In this paper, we introduce a novel approach, Non-uniform Parameter-wise Model Merging, or NP Merge, which merges models by learning the contribution of each parameter to the final model using gradient-based optimization. We empirically demonstrate the effectiveness of our method for merging models of various architectures in multiple settings, outperforming past methods. We also extend NP Merge to handle the merging of multiple models, showcasing its scalability and robustness.
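
To make the core idea concrete, the following is a minimal sketch of learning a per-parameter merging coefficient by gradient descent. It is not the authors' implementation: the sigmoid parameterization of the coefficients, the two-weight linear model, the mean-squared-error loss, the hand-derived gradients, and every function name here are assumptions chosen for illustration. Neuron alignment, which the abstract mentions as a preceding step, is omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def merge(theta_a, theta_b, logits):
    """Per-parameter interpolation: alpha_i * a_i + (1 - alpha_i) * b_i,
    with alpha_i = sigmoid(logit_i). (Assumed parameterization.)"""
    merged = []
    for a, b, logit in zip(theta_a, theta_b, logits):
        alpha = sigmoid(logit)
        merged.append(alpha * a + (1.0 - alpha) * b)
    return merged


def predict(theta, row):
    return sum(w * x for w, x in zip(theta, row))


def mse(theta, xs, ys):
    return sum((predict(theta, row) - y) ** 2 for row, y in zip(xs, ys)) / len(xs)


def learn_merge_logits(theta_a, theta_b, xs, ys, steps=500, lr=0.5):
    """Gradient descent on the merging logits; both base models stay frozen."""
    logits = [0.0] * len(theta_a)  # alpha = 0.5 everywhere, i.e. uniform averaging
    n = len(xs)
    for _ in range(steps):
        theta = merge(theta_a, theta_b, logits)
        # Gradient of the MSE with respect to the merged weights.
        grad_theta = [0.0] * len(theta)
        for row, y in zip(xs, ys):
            err = predict(theta, row) - y
            for i, x in enumerate(row):
                grad_theta[i] += 2.0 * err * x / n
        # Chain rule: d theta_i / d logit_i = alpha_i * (1 - alpha_i) * (a_i - b_i).
        for i, logit in enumerate(logits):
            alpha = sigmoid(logit)
            d_theta_d_logit = alpha * (1.0 - alpha) * (theta_a[i] - theta_b[i])
            logits[i] = logit - lr * grad_theta[i] * d_theta_d_logit
    return logits


def run_toy_example(seed=0):
    rng = random.Random(seed)
    true_w = [1.0, -2.0]  # hypothetical target weights
    xs = [[rng.gauss(0.0, 1.0) for _ in true_w] for _ in range(64)]
    ys = [predict(true_w, row) for row in xs]
    theta_a = [1.0, 0.0]   # hypothetical model A: correct on weight 0 only
    theta_b = [0.0, -2.0]  # hypothetical model B: correct on weight 1 only
    uniform_loss = mse(merge(theta_a, theta_b, [0.0, 0.0]), xs, ys)
    logits = learn_merge_logits(theta_a, theta_b, xs, ys)
    merged_loss = mse(merge(theta_a, theta_b, logits), xs, ys)
    alphas = [sigmoid(logit) for logit in logits]
    return uniform_loss, merged_loss, alphas


if __name__ == "__main__":
    print(run_toy_example())
```

In this toy setup each base model is right about a different weight, so uniform averaging halves both and fits poorly, while the learned coefficients can lean toward model A for the first weight and toward model B for the second. In practice one would presumably use an automatic-differentiation framework and mini-batches rather than the hand-written gradients shown here.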

Comments:	9 pages, 1 figure, to be published in the Proceedings of the 9th IEEE Special Session on Machine Learning on Big Data (MLBD 2024)
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2412.15467 [cs.LG]
	(or arXiv:2412.15467v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2412.15467

Submission history

From: Albert Manuel Orozco Camacho
[v1] Fri, 20 Dec 2024 00:05:14 UTC (401 KB)
