Deep Bayesian Active Learning for Preference Modeling in Large Language Models

Melo, Luckeciano C.; Tigas, Panagiotis; Abate, Alessandro; Gal, Yarin

arXiv:2406.10023 (cs)

[Submitted on 14 Jun 2024 (v1), last revised 28 Oct 2024 (this version, v2)]

Abstract: Leveraging human preferences for steering the behavior of Large Language Models (LLMs) has demonstrated notable success in recent years. Nonetheless, data selection and labeling are still a bottleneck for these systems, particularly at large scale. Hence, selecting the most informative points for acquiring human feedback may considerably reduce the cost of preference labeling and unleash the further development of LLMs. Bayesian Active Learning provides a principled framework for addressing this challenge and has demonstrated remarkable success in diverse settings. However, previous attempts to employ it for Preference Modeling did not meet such expectations. In this work, we identify that naive epistemic uncertainty estimation leads to the acquisition of redundant samples. We address this by proposing the Bayesian Active Learner for Preference Modeling (BAL-PM), a novel stochastic acquisition policy that not only targets points of high epistemic uncertainty according to the preference model but also seeks to maximize the entropy of the acquired prompt distribution in the feature space spanned by the employed LLM. Notably, our experiments demonstrate that BAL-PM requires 33% to 68% fewer preference labels in two popular human preference datasets and exceeds previous stochastic Bayesian acquisition policies.
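
To make the idea in the abstract concrete, the following is a minimal illustrative sketch of an acquisition score that combines the two ingredients it names: epistemic uncertainty of the preference model and diversity of the acquired prompts in a feature space. It is not the authors' implementation. Everything specific here is an assumption made for illustration: the ensemble-based mutual-information estimate of epistemic uncertainty, the log nearest-neighbour distance used as a crude stand-in for an entropy term, the additive combination with a hypothetical weight `beta`, the Gumbel top-k sampler used to make selection stochastic, and all function and parameter names.

```python
import numpy as np


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli distribution with parameter p."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))


def epistemic_uncertainty(member_probs):
    """Mutual information between the label and the model, from an ensemble.

    member_probs has shape (n_members, n_candidates); each entry is one
    ensemble member's probability that the first response is preferred.
    (Assumption: an ensemble is used; the abstract does not say how
    epistemic uncertainty is estimated.)
    """
    total = binary_entropy(member_probs.mean(axis=0))
    expected = binary_entropy(member_probs).mean(axis=0)
    return total - expected


def diversity_bonus(candidate_feats, acquired_feats):
    """Log distance from each candidate prompt to its nearest acquired prompt.

    A simple proxy (assumption) for how much a candidate would spread out
    the acquired prompt distribution in feature space.
    """
    diffs = candidate_feats[:, None, :] - acquired_feats[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    return np.log(dists.min(axis=1) + 1e-12)


def acquisition_scores(member_probs, candidate_feats, acquired_feats, beta=1.0):
    """Uncertainty plus a weighted diversity term (hypothetical weighting)."""
    return epistemic_uncertainty(member_probs) + beta * diversity_bonus(
        candidate_feats, acquired_feats
    )


def sample_batch(scores, batch_size, rng, temperature=1.0):
    """Stochastic selection without replacement via Gumbel top-k (assumption)."""
    perturbed = scores / temperature + rng.gumbel(size=scores.shape)
    return np.argsort(-perturbed)[:batch_size]
```

In this sketch, setting `beta=0` reduces the policy to pure uncertainty-based acquisition, which is the regime the abstract describes as prone to redundant samples; a positive `beta` favours candidates whose prompts lie far from those already acquired.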

Comments:	38th Conference on Neural Information Processing Systems (NeurIPS 2024)
Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL); Machine Learning (stat.ML)
Cite as:	arXiv:2406.10023 [cs.LG]
	(or arXiv:2406.10023v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2406.10023

Submission history

From: Luckeciano Melo
[v1] Fri, 14 Jun 2024 13:32:43 UTC (5,476 KB)
[v2] Mon, 28 Oct 2024 14:25:35 UTC (5,583 KB)
