Tune As You Scale: Hyperparameter Optimization For Compute Efficient Training

Fetterman, Abraham J.; Kitanidis, Ellie; Albrecht, Joshua; Polizzi, Zachary; Fogelman, Bryden; Knutins, Maksis; Wróblewski, Bartosz; Simon, James B.; Qiu, Kanjun

arXiv:2306.08055 (cs)

[Submitted on 13 Jun 2023]

Abstract: Hyperparameter tuning of deep learning models can lead to order-of-magnitude performance gains for the same amount of compute. Despite this, systematic tuning is uncommon, particularly for large models, which are expensive to evaluate and tend to have many hyperparameters, necessitating difficult judgment calls about tradeoffs, budgets, and search bounds. To address these issues and propose a practical method for robustly tuning large models, we present Cost-Aware Pareto Region Bayesian Search (CARBS), a Bayesian optimization algorithm that performs local search around the performance-cost Pareto frontier. CARBS does well even in unbounded search spaces with many hyperparameters, learns scaling relationships so that it can tune models even as they are scaled up, and automates much of the "black magic" of tuning. Among our results, we effectively solve the entire ProcGen benchmark just by tuning a simple baseline (PPO, as provided in the original ProcGen paper). We also reproduce the model size vs. training tokens scaling result from the Chinchilla project (Hoffmann et al. 2022), while simultaneously discovering scaling laws for every other hyperparameter, via an easy automated process that uses significantly less compute and is applicable to any deep learning problem (not just language models).
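
To make the term "performance-cost Pareto frontier" concrete, here is a minimal sketch of how such a frontier could be extracted from a set of completed training runs. It is not the CARBS implementation and does not use any CARBS API; the function name `pareto_front` and the (cost, performance) tuple layout are assumptions chosen for illustration. A run is kept only if no other run is at least as cheap and strictly better.

```python
from typing import List, Tuple

# Hypothetical record layout: (cost, performance), where lower cost and
# higher performance are both preferred. This layout is an assumption
# made for illustration, not something specified by the abstract.
Run = Tuple[float, float]


def pareto_front(runs: List[Run]) -> List[Run]:
    """Return the runs that no other run beats on both cost and performance."""
    # Sort by cost ascending; among equal costs, best performance first.
    ordered = sorted(runs, key=lambda r: (r[0], -r[1]))
    front: List[Run] = []
    best_perf = float("-inf")
    for cost, perf in ordered:
        # A run joins the frontier only if it beats every cheaper
        # (or equally cheap) run seen so far.
        if perf > best_perf:
            front.append((cost, perf))
            best_perf = perf
    return front


# Example: six made-up runs given as (cost, performance) pairs.
example_runs = [(1.0, 0.5), (2.0, 0.4), (2.0, 0.7), (4.0, 0.9), (5.0, 0.85), (1.0, 0.3)]
example_front = pareto_front(example_runs)
```

A search that stays "local" to this frontier, in the sense the abstract describes, would propose new hyperparameter settings near the configurations that produced these frontier runs rather than sampling the whole space; how CARBS does this is described in the paper itself, not in this sketch.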

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2306.08055 [cs.LG]
	(or arXiv:2306.08055v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2306.08055

Submission history

From: Ellie Kitanidis
[v1] Tue, 13 Jun 2023 18:22:24 UTC (4,755 KB)
