Design Considerations in Offline Preference-based RL

Agarwal, Alekh; Dann, Christoph; Marinov, Teodor V.

Computer Science > Machine Learning

arXiv:2502.06861 (cs)

[Submitted on 8 Feb 2025]

Title: Design Considerations in Offline Preference-based RL

Authors: Alekh Agarwal, Christoph Dann, Teodor V. Marinov


Abstract: Offline algorithms for Reinforcement Learning from Human Preferences (RLHF), which use only a fixed dataset of sampled responses given an input, and preference feedback among these responses, have gained increasing prominence in the literature on aligning language models. In this paper, we study how the different design choices made in methods such as DPO, IPO, SLiC and many variants influence the quality of the learned policy, from a theoretical perspective. Our treatment yields insights into the choices of loss function, the policy which is used to normalize log-likelihoods, and also the role of the data sampling policy. Notably, our results do not rely on the standard reparameterization-style arguments used to motivate some of the algorithms in this family, which allows us to give a unified treatment to a broad class of methods. We also conduct a small empirical study to verify some of the theoretical findings on a standard summarization benchmark.

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2502.06861 [cs.LG]
	(or arXiv:2502.06861v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2502.06861

Submission history

From: Teodor Vanislavov Marinov
[v1] Sat, 8 Feb 2025 00:01:37 UTC (2,317 KB)
