Mediator: Memory-efficient LLM Merging with Less Parameter Conflicts and Uncertainty Based Routing

Lai, Kunfeng; Tang, Zhenheng; Pan, Xinglin; Dong, Peijie; Liu, Xiang; Chen, Haolan; Shen, Li; Li, Bo; Chu, Xiaowen

arXiv:2502.04411 (cs)

[Submitted on 6 Feb 2025 (v1), last revised 11 Feb 2025 (this version, v2)]

Abstract: Model merging aggregates Large Language Models (LLMs) fine-tuned on different tasks into a single, stronger model. However, parameter conflicts between models lead to performance degradation when their weights are averaged. Model routing addresses this issue by selecting individual models during inference, but it imposes excessive storage and compute costs and fails to leverage the knowledge shared across models. In this work, we observe that different layers exhibit varying levels of parameter conflict. Building on this insight, we average layers with minimal parameter conflicts and use a novel task-level expert routing for layers with significant conflicts. To further reduce storage costs, inspired by task arithmetic sparsity, we decouple multiple fine-tuned experts into a dense expert and several sparse experts. To handle out-of-distribution samples, we select and merge appropriate experts based on the task uncertainty of the input data. We conduct extensive experiments on both LLaMA and Qwen at varying parameter scales and evaluate on real-world reasoning tasks. Results demonstrate that our method consistently achieves significant performance improvements while requiring less system cost than existing methods.
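
The layer-wise idea in the abstract (average low-conflict layers, route high-conflict ones, and weight experts by task uncertainty) can be illustrated with a minimal sketch. This is not the authors' implementation. The conflict measure (fraction of coordinates whose task-vector signs disagree), the threshold of 0.3, the softmax temperature, and the names `sign_conflict`, `plan_merge`, and `mix_routed` are all assumptions made for illustration, and weights are plain Python lists rather than tensors.

```python
import math


def sign_conflict(task_vectors):
    """Fraction of coordinates whose nonzero deltas disagree in sign.

    Assumed conflict measure for illustration only.
    """
    n = len(task_vectors[0])
    conflicts = 0
    for i in range(n):
        signs = {math.copysign(1, tv[i]) for tv in task_vectors if tv[i] != 0}
        if len(signs) > 1:
            conflicts += 1
    return conflicts / n


def plan_merge(base, experts, threshold=0.3):
    """Decide per layer whether to average or to keep experts for routing.

    base:    {layer_name: [float, ...]}
    experts: {task_name: {layer_name: [float, ...]}}
    threshold is a hypothetical cutoff, not a value from the paper.
    """
    plan = {}
    for layer, w0 in base.items():
        # Task vector = fine-tuned weights minus base weights.
        tvs = [[w - b for w, b in zip(experts[t][layer], w0)] for t in experts]
        if sign_conflict(tvs) <= threshold:
            avg = [b + sum(col) / len(tvs) for b, col in zip(w0, zip(*tvs))]
            plan[layer] = ("average", avg)
        else:
            plan[layer] = ("route", {t: experts[t][layer] for t in experts})
    return plan


def mix_routed(routed, task_logits, temperature=1.0):
    """Softmax-weighted mix of per-task weights for one routed layer.

    task_logits would come from some task classifier (assumed here);
    a flatter distribution means a more uncertain input, so the
    result stays closer to a plain average of the experts.
    """
    m = max(task_logits.values())
    exps = {t: math.exp((z - m) / temperature) for t, z in task_logits.items()}
    total = sum(exps.values())
    probs = {t: e / total for t, e in exps.items()}
    n = len(next(iter(routed.values())))
    return [sum(probs[t] * routed[t][i] for t in routed) for i in range(n)]
```

With two toy experts whose `attn` deltas agree in sign and whose `mlp` deltas oppose each other, `plan_merge` averages `attn` and marks `mlp` for routing; `mix_routed` then blends the `mlp` experts according to how confidently the input is assigned to each task.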

Comments:	work in progress. arXiv admin note: text overlap with arXiv:2405.09673 by other authors
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
MSC classes:	68T50
Cite as:	arXiv:2502.04411 [cs.LG]
	(or arXiv:2502.04411v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2502.04411

Submission history

From: Xinglin Pan
[v1] Thu, 6 Feb 2025 11:26:30 UTC (931 KB)
[v2] Tue, 11 Feb 2025 12:09:51 UTC (931 KB)
