Almost Sure Convergence Rates of Stochastic Zeroth-order Gradient Descent for Łojasiewicz Functions

Wang, Tianyu

Mathematics > Optimization and Control

arXiv:2210.16997v2 (math)

[Submitted on 31 Oct 2022 (v1), revised 14 Nov 2022 (this version, v2), latest version 19 Apr 2023 (v6)]

Title:Almost Sure Convergence Rates of Stochastic Zeroth-order Gradient Descent for Łojasiewicz Functions

Authors:Tianyu Wang

Abstract: We prove \emph{almost sure convergence rates} of Stochastic Zeroth-order Gradient Descent (SZGD) algorithms for Łojasiewicz functions. The SZGD algorithm iterates as \begin{align*}
x_{t+1} = x_t - \eta_t \widehat{\nabla} f (x_t), \qquad t = 0,1,2,3,\cdots , \end{align*} where $f$ is the objective function, which satisfies the Łojasiewicz inequality with Łojasiewicz exponent $\theta$, $\eta_t$ is the step size (learning rate), and $\widehat{\nabla} f (x_t)$ is the approximate gradient estimated using zeroth-order information. We show that, for smooth Łojasiewicz functions, the sequence $\{ x_t \}_{t\in\mathbb{N}}$ generated by SZGD converges to a bounded point $x_\infty$ almost surely, and $x_\infty$ is a critical point of $f$. If $\theta \in (0,\frac{1}{2}]$, then $f (x_t) - f (x_\infty)$, $\sum_{s=t}^\infty \| x_{s+1} - x_{s} \|^2$ and $\| x_t - x_\infty \|$ (where $\| \cdot \|$ is the Euclidean norm) converge to zero \emph{linearly almost surely}. If $\theta \in (\frac{1}{2}, 1)$, then $f (x_t) - f (x_\infty)$ and $\sum_{s=t}^\infty \| x_{s+1} - x_s \|^2$ converge to zero at rate $O \left( t^{\frac{1}{1 - 2\theta}} \right)$ almost surely, and $\| x_{t} - x_\infty \|$ converges to zero at rate $O \left( t^{\frac{1-\theta}{1-2\theta}} \right)$ almost surely. To the best of our knowledge, this paper provides the first \emph{almost sure convergence rate} guarantee for stochastic zeroth-order algorithms for Łojasiewicz functions.
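
The iteration in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how $\widehat{\nabla} f$ is constructed, so everything below is an assumption made for illustration only: the estimator is a two-point central difference along random unit directions, and the step size `eta` and finite-difference radius `delta` are held constant. The function names `zeroth_order_gradient` and `szgd` are hypothetical and may differ from the estimator and schedules analyzed in the paper.

```python
import numpy as np


def zeroth_order_gradient(f, x, delta, k, rng):
    """Estimate grad f(x) from function values only (assumed estimator).

    Averages k central differences along random unit directions v and
    rescales by the dimension n, since E[n * (g . v) * v] = g when v is
    uniform on the unit sphere.
    """
    n = x.size
    g = np.zeros(n)
    for _ in range(k):
        v = rng.standard_normal(n)
        v /= np.linalg.norm(v)  # uniform direction on the unit sphere
        slope = (f(x + delta * v) - f(x - delta * v)) / (2.0 * delta)
        g += slope * v
    return (n / k) * g


def szgd(f, x0, eta, delta, k=1, steps=1000, seed=0):
    """Run x_{t+1} = x_t - eta * g_hat(x_t) with constant eta and delta."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x = x - eta * zeroth_order_gradient(f, x, delta, k, rng)
    return x


if __name__ == "__main__":
    # f(x) = 0.5 * ||x||^2 satisfies the Lojasiewicz inequality with
    # exponent 1/2, the regime in which the abstract states linear rates.
    def quadratic(z):
        return 0.5 * float(z @ z)

    x_final = szgd(quadratic, np.ones(5), eta=0.1, delta=1e-3, steps=500)
    print(np.linalg.norm(x_final))
```

For the quadratic test function the central difference is exact, so with `eta * n < 2` each step cannot increase $\| x_t \|$ and the iterates shrink toward the critical point $x_\infty = 0$. For general smooth objectives the estimator carries a bias that depends on `delta`, which this sketch does not address.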

Comments:	An embarrassing error in Section 5 is fixed. Some rates are (slightly) improved.
Subjects:	Optimization and Control (math.OC); Machine Learning (cs.LG)
Cite as:	arXiv:2210.16997 [math.OC]
	(or arXiv:2210.16997v2 [math.OC] for this version)
	https://doi.org/10.48550/arXiv.2210.16997

Submission history

From: Tianyu Wang
[v1] Mon, 31 Oct 2022 00:53:17 UTC (668 KB)
[v2] Mon, 14 Nov 2022 05:52:39 UTC (669 KB)
[v3] Mon, 27 Feb 2023 23:57:26 UTC (1,405 KB)
[v4] Thu, 9 Mar 2023 00:30:56 UTC (1,404 KB)
[v5] Mon, 20 Mar 2023 14:18:13 UTC (1,406 KB)
[v6] Wed, 19 Apr 2023 12:20:47 UTC (1,406 KB)
