RSG: Beating SG without Smoothness and/or Strong Convexity

Yang, Tianbao; Lin, Qihang

Mathematics > Optimization and Control

arXiv:1512.03107v8 (math)

[Submitted on 9 Dec 2015 (v1), revised 5 Apr 2016 (this version, v8), latest version 12 Nov 2018 (v14)]

Abstract: In this paper, we propose novel deterministic and stochastic Restarted SubGradient (RSG) methods that can find an $\epsilon$-optimal solution for a broad class of non-smooth and/or non-strongly convex optimization problems faster than the vanilla deterministic or stochastic subgradient method (SG). We show that for non-smooth and non-strongly convex optimization, RSG can reduce the dependence of SG's iteration complexity on the distance from the initial solution to the optimal set to a dependence on the distance from points on the $\epsilon$-level set to the optimal set. For a special family of non-smooth and non-strongly convex optimization problems whose epigraph is a polyhedron, we further show that RSG can converge linearly. In addition, RSG has an $O(\frac{1}{\epsilon}\log(\frac{1}{\epsilon}))$ iteration complexity for problems satisfying a much weaker notion of strong convexity, namely local semi-strong convexity. For a family of non-smooth optimization problems that admit a local Kurdyka-Łojasiewicz property with a power constant of $\beta\in(0,1)$, RSG has an $O(\frac{1}{\epsilon^{2\beta}}\log(\frac{1}{\epsilon}))$ iteration complexity, which improves on the $O(\frac{1}{\epsilon^2})$ iteration complexity of SG for such problems. The novelty of our analysis lies in exploiting the lower bound of the subgradient of the objective function at the $\epsilon$-level set. It is this novelty that allows us to exploit the local properties of functions (e.g., local semi-strong convexity, the local Kurdyka-Łojasiewicz property, and more generally local error bounds) to develop improved convergence of RSG.
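The restarting idea in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the algorithm, so the details here are assumptions made for illustration: each stage runs a fixed number of subgradient steps with a constant step size and returns the averaged iterate, the next stage restarts from that average, and the step size is halved between stages. The function names (`subgradient_stage`, `rsg`), the parameters (`eta0`, `t`, `num_stages`), and the toy objective $f(w) = \|w\|_1$ are all hypothetical choices, not taken from the paper.

```python
import numpy as np


def subgradient_stage(subgrad, w0, eta, t):
    """Run t subgradient steps with constant step size eta.

    Returns the average of the t iterates visited (starting point included),
    which is the usual output of a constant-step subgradient method.
    """
    w = np.asarray(w0, dtype=float).copy()
    total = np.zeros_like(w)
    for _ in range(t):
        total += w                    # accumulate the current iterate
        w = w - eta * subgrad(w)      # plain (unprojected) subgradient step
    return total / t


def rsg(subgrad, w0, eta0, t, num_stages):
    """Restarted subgradient sketch (assumed scheme, see lead-in).

    Each stage restarts from the previous stage's averaged solution,
    and the step size is halved after every stage.
    """
    w = np.asarray(w0, dtype=float)
    eta = eta0
    for _ in range(num_stages):
        w = subgradient_stage(subgrad, w, eta, t)
        eta /= 2.0
    return w


def l1_subgrad(w):
    """A subgradient of the toy objective f(w) = ||w||_1."""
    return np.sign(w)


# Toy run: the optimal set is {0}, and the epigraph of the l1 norm is a polyhedron.
w_final = rsg(l1_subgrad, w0=np.array([4.0, -3.0]), eta0=1.0, t=50, num_stages=10)
print("final objective:", np.abs(w_final).sum())
```

On this toy problem the per-stage budget `t` stays fixed while the step size shrinks geometrically, which is the kind of behavior the abstract's linear-convergence claim for polyhedral problems refers to; the specific constants are arbitrary and not tuned according to the paper's theory.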

Subjects:	Optimization and Control (math.OC); Machine Learning (stat.ML)
Cite as:	arXiv:1512.03107 [math.OC]
	(or arXiv:1512.03107v8 [math.OC] for this version)
	https://doi.org/10.48550/arXiv.1512.03107

Submission history

From: Tianbao Yang [view email]
[v1] Wed, 9 Dec 2015 22:58:21 UTC (58 KB)
[v2] Sun, 13 Dec 2015 23:40:38 UTC (165 KB)
[v3] Mon, 4 Jan 2016 20:36:11 UTC (165 KB)
[v4] Tue, 5 Jan 2016 03:33:59 UTC (165 KB)
[v5] Wed, 3 Feb 2016 05:56:29 UTC (262 KB)
[v6] Mon, 29 Feb 2016 05:37:18 UTC (264 KB)
[v7] Fri, 4 Mar 2016 23:57:06 UTC (264 KB)
[v8] Tue, 5 Apr 2016 03:40:59 UTC (265 KB)
[v9] Wed, 4 May 2016 04:07:35 UTC (277 KB)
[v10] Thu, 23 Jun 2016 05:08:01 UTC (277 KB)
[v11] Thu, 11 Aug 2016 23:09:32 UTC (172 KB)
[v12] Tue, 29 Nov 2016 06:12:35 UTC (174 KB)
[v13] Wed, 18 Apr 2018 20:57:48 UTC (206 KB)
[v14] Mon, 12 Nov 2018 05:23:28 UTC (206 KB)
