Mathematics > Optimization and Control
[Submitted on 3 Apr 2023 (v1), last revised 4 Apr 2023 (this version, v2)]
Title: Improved Convergence in High Probability of Clipped Gradient Methods with Heavy Tails
Abstract: In this work, we study the convergence \emph{in high probability} of clipped gradient methods when the noise distribution has heavy tails, i.e., with bounded $p$th moments, for some $1<p\le2$. Prior works in this setting follow the same recipe of using concentration inequalities and an inductive argument with a union bound to bound the iterates across all iterations. This method results in an increase in the failure probability by a factor of $T$, where $T$ is the number of iterations. We instead propose a new analysis approach based on bounding the moment generating function of a well-chosen supermartingale sequence. We improve the dependency on $T$ in the convergence guarantee for a wide range of algorithms with clipped gradients, including stochastic (accelerated) mirror descent for convex objectives and stochastic gradient descent for nonconvex objectives. This approach naturally allows the algorithms to use time-varying step sizes and clipping parameters when the time horizon is unknown, which appears impossible in prior works. We show that in the case of clipped stochastic mirror descent, problem constants, including the initial distance to the optimum, are not required when setting step sizes and clipping parameters.
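For readers unfamiliar with the setting, the following is a minimal sketch of clipped stochastic gradient descent with a time-varying step size and clipping threshold under heavy-tailed noise. It is an illustration only, not the algorithm or parameter choice analyzed in the paper: the function names, the Pareto-type noise model, and the schedules $\eta_t = \eta_0 t^{-1/p}$ and $\lambda_t = \lambda_0 t^{1/p}$ are all assumptions made for this example.

```python
import math
import random


def clip(g, lam):
    """Rescale g so that its Euclidean norm is at most lam."""
    norm = math.sqrt(sum(v * v for v in g))
    if norm <= lam:
        return list(g)
    return [v * (lam / norm) for v in g]


def heavy_tailed_noise(dim, rng, alpha=1.6):
    """Zero-mean symmetric Pareto-type noise (assumed model).

    Moments of order below alpha are finite, so with alpha = 1.6 the
    variance is infinite while the 1.5th moment is bounded.
    """
    out = []
    for _ in range(dim):
        magnitude = rng.paretovariate(alpha) - 1.0
        sign = 1.0 if rng.random() < 0.5 else -1.0
        out.append(sign * magnitude)
    return out


def clipped_sgd(x0, grad, steps, rng, eta0=0.5, lam0=1.0, p=1.5):
    """Clipped SGD with illustrative (hypothetical) time-varying schedules."""
    x = list(x0)
    for t in range(1, steps + 1):
        eta_t = eta0 * t ** (-1.0 / p)  # step size, shrinks with t
        lam_t = lam0 * t ** (1.0 / p)   # clipping threshold, grows with t
        noise = heavy_tailed_noise(len(x), rng)
        g = [gi + ni for gi, ni in zip(grad(x), noise)]  # noisy gradient
        g = clip(g, lam_t)
        x = [xi - eta_t * gi for xi, gi in zip(x, g)]
    return x
```

A possible call, using the quadratic $f(x) = \tfrac12\|x\|^2$ as a stand-in objective, is `clipped_sgd([5.0, -5.0], lambda x: list(x), 200, random.Random(0))`. Neither schedule depends on the total number of iterations, which mirrors the unknown-horizon setting the abstract mentions; with these particular schedules each update moves the iterate by at most $\eta_0 \lambda_0$ in Euclidean norm.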
Submission history
From: Ta Duy Nguyen
[v1] Mon, 3 Apr 2023 16:34:11 UTC (22 KB)
[v2] Tue, 4 Apr 2023 17:27:53 UTC (22 KB)