Improving Line Search Methods for Large Scale Neural Network Training

Kenneweg, Philip; Kenneweg, Tristan; Hammer, Barbara

doi:10.1109/ACDSA59508.2024.10467724


arXiv:2403.18519 (cs)

[Submitted on 27 Mar 2024]



Abstract: In recent studies, line search methods have significantly improved the performance of traditional stochastic gradient descent techniques, eliminating the need for a specific learning rate schedule. In this paper, we identify existing issues in state-of-the-art line search methods, propose enhancements, and rigorously evaluate their effectiveness. We test these methods on larger datasets and more complex data domains than before. Specifically, we improve the Armijo line search by integrating the momentum term from Adam into its search direction, enabling efficient large-scale training, a task that was previously prone to failure using Armijo line search methods. Our optimization approach outperforms both the previous Armijo implementation and tuned learning rate schedules for Adam. Our evaluation focuses on Transformers and CNNs in the domains of NLP and image data. Our work is publicly available as a Python package, which provides a hyperparameter-free PyTorch optimizer.
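
To make the idea in the abstract concrete, the following is a minimal sketch of a backtracking Armijo line search applied along an Adam-style search direction, run on a toy quadratic. It is not the authors' implementation or the API of their package: the function names (`adam_direction`, `armijo_step`), the constants (`eta0`, `c`, `shrink`), the toy loss, and the fallback to the plain gradient when the momentum direction is not a descent direction are all assumptions made for illustration.

```python
import math


def adam_direction(grad, m, v, t, beta1=0.9, beta2=0.999, eps=1e-8):
    """Update Adam moment estimates; return (direction, m, v)."""
    m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, grad)]
    v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, grad)]
    m_hat = [mi / (1 - beta1 ** t) for mi in m]  # bias-corrected first moment
    v_hat = [vi / (1 - beta2 ** t) for vi in v]  # bias-corrected second moment
    d = [mh / (math.sqrt(vh) + eps) for mh, vh in zip(m_hat, v_hat)]
    return d, m, v


def armijo_step(loss, w, grad, d, eta0=1.0, c=0.1, shrink=0.5, max_backtracks=30):
    """Backtrack on the step size eta until the Armijo condition
    loss(w - eta * d) <= loss(w) - c * eta * <grad, d> holds."""
    slope = sum(gi * di for gi, di in zip(grad, d))
    if slope <= 0:
        # Assumption of this sketch: if d is not a descent direction,
        # fall back to the plain gradient.
        d = grad
        slope = sum(gi * gi for gi in grad)
    f0 = loss(w)
    eta = eta0
    for _ in range(max_backtracks):
        w_new = [wi - eta * di for wi, di in zip(w, d)]
        if loss(w_new) <= f0 - c * eta * slope:
            return w_new, eta
        eta *= shrink
    return w, 0.0  # no acceptable step found; keep the current point


def loss(w):
    # Hypothetical toy objective: an ill-conditioned quadratic.
    return 0.5 * (w[0] ** 2 + 10.0 * w[1] ** 2)


def grad_fn(w):
    return [w[0], 10.0 * w[1]]


w = [3.0, -2.0]
m, v = [0.0, 0.0], [0.0, 0.0]
initial_loss = loss(w)
for t in range(1, 51):
    g = grad_fn(w)
    d, m, v = adam_direction(g, m, v, t)
    w, eta = armijo_step(loss, w, g, d)
final_loss = loss(w)
```

Because a step is only accepted when the Armijo condition holds, the loss in this sketch never increases from one iteration to the next, and no learning rate schedule has to be supplied by hand. In stochastic training the loss and gradient would be evaluated on a mini-batch, which this toy example does not model.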

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2403.18519 [cs.LG]
	(or arXiv:2403.18519v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2403.18519
Related DOI:	https://doi.org/10.1109/ACDSA59508.2024.10467724

Submission history

From: Philip Kenneweg
[v1] Wed, 27 Mar 2024 12:50:27 UTC (17,916 KB)
