Learning-rate-free Momentum SGD with Reshuffling Converges in Nonsmooth Nonconvex Optimization

Hu, Xiaoyin; Xiao, Nachuan; Liu, Xin; Toh, Kim-Chuan

Mathematics > Optimization and Control

arXiv:2406.18287 (math)

[Submitted on 26 Jun 2024]

Abstract: In this paper, we propose a generalized framework for developing learning-rate-free momentum stochastic gradient descent (SGD) methods for minimizing nonsmooth nonconvex functions, especially in training nonsmooth neural networks. Our framework adaptively generates learning rates based on the historical data of stochastic subgradients and iterates. Under mild conditions, we prove that our proposed framework enjoys global convergence to the stationary points of the objective function in the sense of the conservative field, hence providing convergence guarantees for training nonsmooth neural networks. Based on our proposed framework, we propose a novel learning-rate-free momentum SGD method (LFM). Preliminary numerical experiments reveal that LFM performs comparably to the state-of-the-art learning-rate-free methods (which have not been shown theoretically to be convergent) across well-known neural network training benchmarks.
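
To make the idea of a learning rate generated from "the historical data of stochastic subgradients and iterates" concrete, the following is a minimal Python sketch of momentum SGD with random reshuffling. It is not the LFM method from the paper, whose update rule is not given in the abstract. As a stand-in, the step size follows a distance-over-gradients (DoG) style rule: the largest distance travelled from the starting point divided by the square root of the accumulated squared subgradient norms. The function name `lr_free_momentum_sgd`, the parameters `beta` and `r_eps`, and the toy objective (mean absolute deviation from five targets) are all assumptions chosen for illustration.

```python
import math
import random


def lr_free_momentum_sgd(subgrad, x0, n_samples, epochs=400, beta=0.9,
                         r_eps=1e-3, seed=0):
    """Momentum SGD with reshuffling and a DoG-style step size (illustrative).

    subgrad(x, i) must return a subgradient (list of floats) of the i-th
    component function at x. The step-size rule is an assumed stand-in,
    not the paper's LFM update.
    """
    rng = random.Random(seed)
    x = list(x0)
    momentum = [0.0] * len(x)
    x0_norm = math.sqrt(sum(v * v for v in x0))
    max_dist = r_eps * (1.0 + x0_norm)  # small initial movement scale
    grad_sq_sum = 0.0                   # running sum of squared subgradient norms
    for _ in range(epochs):
        # Reshuffling: visit every sample exactly once per epoch, in random order.
        order = list(range(n_samples))
        rng.shuffle(order)
        for i in order:
            g = subgrad(x, i)
            grad_sq_sum += sum(v * v for v in g)
            momentum = [beta * m + (1.0 - beta) * gi
                        for m, gi in zip(momentum, g)]
            # Step size built only from past iterates and subgradients.
            eta = max_dist / math.sqrt(grad_sq_sum + 1e-12)
            x = [xi - eta * m for xi, m in zip(x, momentum)]
            dist = math.sqrt(sum((xi - x0i) ** 2 for xi, x0i in zip(x, x0)))
            max_dist = max(max_dist, dist)
    return x


# Toy nonsmooth problem (hypothetical): f(x) = mean_i |x - b_i|, minimized at the median.
targets = [1.0, 2.0, 3.0, 4.0, 5.0]


def abs_subgrad(x, i):
    """A subgradient of |x - b_i|: the sign of x - b_i (0 at the kink)."""
    d = x[0] - targets[i]
    return [float((d > 0) - (d < 0))]


def objective(x):
    return sum(abs(x[0] - b) for b in targets) / len(targets)


x_final = lr_free_momentum_sgd(abs_subgrad, [0.0], len(targets))
```

No learning rate is passed in: the only scale parameter is `r_eps`, which sets the size of the very first steps. The toy objective is convex, so it illustrates the mechanics only and says nothing about the nonsmooth nonconvex setting the paper analyzes.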

Comments:	26 pages
Subjects:	Optimization and Control (math.OC)
Cite as:	arXiv:2406.18287 [math.OC]
	(or arXiv:2406.18287v1 [math.OC] for this version)
	https://doi.org/10.48550/arXiv.2406.18287

Submission history

From: Nachuan Xiao
[v1] Wed, 26 Jun 2024 12:15:45 UTC (677 KB)
