Prodigy: An Expeditiously Adaptive Parameter-Free Learner

Mishchenko, Konstantin; Defazio, Aaron


arXiv:2306.06101 (cs)

[Submitted on 9 Jun 2023 (v1), last revised 19 Mar 2024 (this version, v4)]

Abstract: We consider the problem of estimating the learning rate in adaptive methods, such as AdaGrad and Adam. We propose Prodigy, an algorithm that provably estimates the distance to the solution $D$, which is needed to set the learning rate optimally. At its core, Prodigy is a modification of the D-Adaptation method for learning-rate-free learning. It improves upon the convergence rate of D-Adaptation by a factor of $O(\sqrt{\log(D/d_0)})$, where $d_0$ is the initial estimate of $D$. We test Prodigy on 12 common logistic-regression benchmark datasets, VGG11 and ResNet-50 training on CIFAR10, ViT training on ImageNet, LSTM training on IWSLT14, DLRM training on the Criteo dataset, VarNet on the Knee MRI dataset, as well as RoBERTa and GPT transformer training on BookWiki. Our experimental results show that our approach consistently outperforms D-Adaptation and reaches test accuracy values close to those of hand-tuned Adam.
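
To make the idea of "estimating the distance to the solution $D$" concrete, the following is a minimal sketch of a distance-estimating subgradient method on a toy problem. It assumes the gradient-descent form of the update as we recall it from the paper body, not from this listing, so the exact step-size formula should be treated as an assumption. The function names, the toy objective $f(x) = \|x - x_*\|_1$, and all constants are hypothetical and chosen only for illustration.

```python
import math

def prodigy_gd_sketch(subgrad, x0, d0=1e-6, G=1.0, steps=2000):
    """Subgradient descent whose step size is driven by a running
    lower-bound estimate d of D = ||x0 - x*|| (assumed update form)."""
    x = list(x0)
    d = d0
    weighted_sq_sum = 0.0  # running sum of d_i^2 * ||g_i||^2
    numerator = 0.0        # running sum of eta_i * <g_i, x0 - x_i>
    d_history = [d]
    for _ in range(steps):
        g = subgrad(x)
        weighted_sq_sum += d * d * sum(gi * gi for gi in g)
        # Assumed step size: d^2 / sqrt(d^2 G^2 + sum_i d_i^2 ||g_i||^2).
        eta = d * d / math.sqrt(d * d * G * G + weighted_sq_sum)
        numerator += eta * sum(
            gi * (x0i - xi) for gi, x0i, xi in zip(g, x0, x)
        )
        x = [xi - eta * gi for xi, gi in zip(x, g)]
        moved = math.sqrt(sum((xi - x0i) ** 2 for xi, x0i in zip(x, x0)))
        # For convex f this ratio never exceeds D, so d stays a lower bound.
        d_hat = numerator / moved if moved > 0 else 0.0
        d = max(d, d_hat)  # the estimate is never decreased
        d_history.append(d)
    return x, d_history

# Hypothetical toy problem: f(x) = ||x - X_STAR||_1, started at the origin,
# so the true distance is D = 5 and subgradient norms are at most sqrt(2).
X_STAR = [3.0, -4.0]

def l1_subgrad(x):
    # Componentwise sign of (x - X_STAR); 0 is a valid subgradient at a kink.
    return [(xi > ti) - (xi < ti) for xi, ti in zip(x, X_STAR)]

x_final, d_history = prodigy_gd_sketch(
    l1_subgrad, [0.0, 0.0], d0=1e-6, G=math.sqrt(2.0)
)
print("final estimate of D:", d_history[-1])
```

In this sketch the estimate starts from a deliberately tiny $d_0$, only ever grows, and cannot exceed the true distance when the objective is convex; the ratio $D/d_0$ is the quantity that appears inside the logarithm in the abstract's rate improvement.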

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Optimization and Control (math.OC); Machine Learning (stat.ML)
Cite as:	arXiv:2306.06101 [cs.LG]
	(or arXiv:2306.06101v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2306.06101

Submission history

From: Konstantin Mishchenko
[v1] Fri, 9 Jun 2023 17:59:35 UTC (3,066 KB)
[v2] Thu, 21 Sep 2023 16:29:31 UTC (4,378 KB)
[v3] Sun, 29 Oct 2023 15:08:03 UTC (4,377 KB)
[v4] Tue, 19 Mar 2024 23:01:06 UTC (4,028 KB)
