Large Language Models to Enhance Bayesian Optimization

Liu, Tennison; Astorga, Nicolás; Seedat, Nabeel; van der Schaar, Mihaela

arXiv:2402.03921 (cs)

[Submitted on 6 Feb 2024 (v1), last revised 8 Mar 2024 (this version, v2)]

Abstract: Bayesian optimization (BO) is a powerful approach for optimizing complex and expensive-to-evaluate black-box functions. Its importance is underscored in many applications, notably hyperparameter tuning, but its efficacy depends on efficiently balancing exploration and exploitation. While there has been substantial progress in BO methods, striking this balance remains a delicate process. In this light, we present LLAMBO, a novel approach that integrates the capabilities of Large Language Models (LLMs) within BO. At a high level, we frame the BO problem in natural language, enabling LLMs to iteratively propose and evaluate promising solutions conditioned on historical evaluations. More specifically, we explore how combining contextual understanding, few-shot learning proficiency, and domain knowledge of LLMs can improve model-based BO. Our findings illustrate that LLAMBO is effective at zero-shot warmstarting, and enhances surrogate modeling and candidate sampling, especially in the early stages of search when observations are sparse. Our approach is performed in context and does not require LLM finetuning. Additionally, it is modular by design, allowing individual components to be integrated into existing BO frameworks, or to function cohesively as an end-to-end method. We empirically validate LLAMBO's efficacy on the problem of hyperparameter tuning, highlighting strong empirical performance across a range of diverse benchmark, proprietary, and synthetic tasks.

Comments:	Accepted as Poster at ICLR 2024
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2402.03921 [cs.LG]
	(or arXiv:2402.03921v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2402.03921

Submission history

From: Tennison Liu
[v1] Tue, 6 Feb 2024 11:44:06 UTC (9,596 KB)
[v2] Fri, 8 Mar 2024 12:23:56 UTC (13,869 KB)
