SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters

Xiao, Teng; Yuan, Yige; Chen, Zhengyu; Li, Mingxiao; Liang, Shangsong; Ren, Zhaochun; Honavar, Vasant G

arXiv:2502.00883 (cs)

[Submitted on 2 Feb 2025 (v1), last revised 20 Feb 2025 (this version, v4)]

Abstract: Existing preference optimization objectives for language model alignment require additional hyperparameters that must be extensively tuned to achieve optimal performance, increasing both the complexity and time required for fine-tuning large language models. In this paper, we propose a simple yet effective hyperparameter-free preference optimization algorithm for alignment. We observe that promising performance can be achieved simply by optimizing inverse perplexity, which is calculated as the inverse of the exponentiated average log-likelihood of the chosen and rejected responses in the preference dataset. The resulting simple learning objective, SimPER, is easy to implement and eliminates the need for expensive hyperparameter tuning and a reference model, making it both computationally and memory efficient. Extensive experiments on widely used real-world benchmarks, including MT-Bench, AlpacaEval 2, and 10 key benchmarks of the Open LLM Leaderboard with 5 base models, demonstrate that SimPER consistently and significantly outperforms existing approaches, even without any hyperparameters or a reference model. For example, despite its simplicity, SimPER outperforms state-of-the-art methods by up to 5.7 points on AlpacaEval 2 and achieves the highest average ranking across 10 benchmarks on the Open LLM Leaderboard. The source code for SimPER is publicly available at: this https URL.
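
To make the idea of optimizing inverse perplexity concrete, the following is a minimal sketch, not the authors' released implementation. It assumes that inverse perplexity is the reciprocal of the usual perplexity, that is, the exponential of the mean per-token log-probability of a response, and that the objective raises this quantity for the chosen response while lowering it for the rejected one. The function names `inverse_perplexity` and `simper_style_loss` are hypothetical, and the per-token log-probabilities are assumed to come from the policy model being fine-tuned; the exact form of the objective in the paper may differ.

```python
import math
from typing import Sequence


def inverse_perplexity(token_logps: Sequence[float]) -> float:
    """Return exp(mean per-token log-probability), i.e. 1 / perplexity.

    `token_logps` holds log pi(y_t | x, y_<t) for each token of one response.
    Averaging over tokens normalizes for response length.
    """
    if not token_logps:
        raise ValueError("response must contain at least one token")
    return math.exp(sum(token_logps) / len(token_logps))


def simper_style_loss(
    chosen_logps: Sequence[float],
    rejected_logps: Sequence[float],
) -> float:
    """Assumed SimPER-style loss for a single preference pair.

    Minimizing this value increases the inverse perplexity of the chosen
    response and decreases that of the rejected response. It has no
    temperature, margin, or reference-model term, which is the sense in
    which the sketch is hyperparameter-free.
    """
    return inverse_perplexity(rejected_logps) - inverse_perplexity(chosen_logps)


if __name__ == "__main__":
    # Toy pair: the model assigns probability 0.5 to every chosen token
    # and 0.25 to every rejected token.
    chosen = [math.log(0.5)] * 4
    rejected = [math.log(0.25)] * 2
    print(simper_style_loss(chosen, rejected))
```

Because each inverse perplexity lies in (0, 1], the per-pair loss in this sketch is bounded between -1 and 1; in the toy pair it is approximately -0.25. In an actual training loop the same expression would be computed on tensors of token log-probabilities and averaged over a batch.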

Comments:	ICLR 2025
Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL)
Cite as:	arXiv:2502.00883 [cs.LG]
	(or arXiv:2502.00883v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2502.00883

Submission history

From: Teng Xiao
[v1] Sun, 2 Feb 2025 19:25:41 UTC (1,023 KB)
[v2] Tue, 4 Feb 2025 16:02:53 UTC (1,023 KB)
[v3] Tue, 18 Feb 2025 02:09:35 UTC (1,023 KB)
[v4] Thu, 20 Feb 2025 15:26:44 UTC (1,023 KB)
