Accelerating Neural Network Training: An Analysis of the AlgoPerf Competition

Kasimbeg, Priya; Schneider, Frank; Eschenhagen, Runa; Bae, Juhan; Sastry, Chandramouli Shama; Saroufim, Mark; Feng, Boyuan; Wright, Less; Yang, Edward Z.; Nado, Zachary; Medapati, Sourabh; Hennig, Philipp; Rabbat, Michael; Dahl, George E.

arXiv:2502.15015 (cs)

[Submitted on 20 Feb 2025]

Abstract: The goal of the AlgoPerf: Training Algorithms competition is to evaluate practical speed-ups in neural network training achieved solely by improving the underlying training algorithms. In the external tuning ruleset, submissions must provide workload-agnostic hyperparameter search spaces, while in the self-tuning ruleset they must be completely hyperparameter-free. In both rulesets, submissions are compared on time-to-result across multiple deep learning workloads, training on fixed hardware. This paper presents the inaugural AlgoPerf competition's results, which drew 18 diverse submissions from 10 teams. Our investigation reveals several key findings: (1) The winning submission in the external tuning ruleset, using Distributed Shampoo, demonstrates the effectiveness of non-diagonal preconditioning over popular methods like Adam, even when compared on wall-clock runtime. (2) The winning submission in the self-tuning ruleset, based on the Schedule Free AdamW algorithm, demonstrates a new level of effectiveness for completely hyperparameter-free training algorithms. (3) The top-scoring submissions were surprisingly robust to workload changes. We also discuss the engineering challenges encountered in ensuring a fair comparison between different training algorithms. These results highlight both the significant progress so far and the considerable room for further improvements.

Comments:	ICLR 2025; 23 pages, 5 figures, 8 tables
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2502.15015 [cs.LG]
	(or arXiv:2502.15015v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2502.15015

Submission history

From: Frank Schneider
[v1] Thu, 20 Feb 2025 20:11:54 UTC (1,635 KB)
