Towards stability and optimality in stochastic gradient descent

Toulis, Panos; Tran, Dustin; Airoldi, Edoardo M.


arXiv:1505.02417 (stat)

[Submitted on 10 May 2015 (v1), last revised 7 Jun 2016 (this version, v4)]


Abstract: Iterative procedures for parameter estimation based on stochastic gradient descent allow the estimation to scale to massive data sets. However, in both theory and practice, they suffer from numerical instability. Moreover, they are statistically inefficient as estimators of the true parameter value. To address these two issues, we propose a new iterative procedure termed averaged implicit SGD (AI-SGD). For statistical efficiency, AI-SGD employs averaging of the iterates, which achieves the optimal Cramér-Rao bound under strong convexity, i.e., it is an optimal unbiased estimator of the true parameter value. For numerical stability, AI-SGD employs an implicit update at each iteration, which is related to proximal operators in optimization. In practice, AI-SGD achieves competitive performance with other state-of-the-art procedures. Furthermore, it is more stable than averaging procedures that do not employ proximal updates, and is simple to implement as it requires fewer tunable hyperparameters than procedures that do employ proximal updates.
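
To make the two ingredients named in the abstract concrete, here is a minimal sketch of an averaged implicit update for ordinary least squares. It is an illustration under stated assumptions, not the authors' implementation: the abstract gives no formulas, so the squared loss, the learning-rate schedule `gamma1 * n ** (-decay)`, and the names `ai_sgd_least_squares` and `make_data` are all choices made here. For the squared loss, the implicit equation theta_n = theta_{n-1} + gamma_n (y_n - x_n . theta_n) x_n can be solved in closed form, which is what the code uses.

```python
import random


def ai_sgd_least_squares(data, dim, gamma1=1.0, decay=0.6):
    """Sketch of averaged implicit SGD for the loss 0.5 * (y - x.theta)^2.

    gamma1 and decay define an assumed schedule gamma_n = gamma1 * n^(-decay).
    """
    theta = [0.0] * dim      # current implicit-SGD iterate
    theta_bar = [0.0] * dim  # running average of the iterates
    for n, (x, y) in enumerate(data, start=1):
        gamma = gamma1 * n ** (-decay)
        residual = y - sum(xi * ti for xi, ti in zip(x, theta))
        # Implicit update: theta_n appears on both sides of
        #   theta_n = theta_prev + gamma * (y - x.theta_n) * x.
        # For squared loss the solution is an explicit step whose size is
        # shrunk by 1 / (1 + gamma * ||x||^2), which bounds the step.
        scale = gamma / (1.0 + gamma * sum(xi * xi for xi in x))
        theta = [ti + scale * residual * xi for ti, xi in zip(theta, x)]
        # Averaging: update the running mean of all iterates so far.
        theta_bar = [bi + (ti - bi) / n for bi, ti in zip(theta_bar, theta)]
    return theta_bar


def make_data(true_theta, n, noise=0.1, seed=0):
    """Hypothetical synthetic data: Gaussian features, Gaussian noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in true_theta]
        y = sum(xi * ti for xi, ti in zip(x, true_theta)) + rng.gauss(0.0, noise)
        data.append((x, y))
    return data


if __name__ == "__main__":
    true_theta = [1.0, -2.0, 0.5]
    estimate = ai_sgd_least_squares(make_data(true_theta, 20000), dim=3)
    print(estimate)  # should land close to true_theta
```

In this sketch the shrinkage factor keeps each step bounded even when `gamma1` is set very large, which is one way to see the stability the abstract attributes to the implicit update. The running mean is the averaging step. For losses without a closed-form solution, the implicit equation would have to be solved numerically at each iteration.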

Comments:	Appears in Artificial Intelligence and Statistics, 2016
Subjects:	Methodology (stat.ME); Machine Learning (cs.LG); Computation (stat.CO); Machine Learning (stat.ML)
Cite as:	arXiv:1505.02417 [stat.ME]
	(or arXiv:1505.02417v4 [stat.ME] for this version)
	https://doi.org/10.48550/arXiv.1505.02417

Submission history

From: Dustin Tran
[v1] Sun, 10 May 2015 18:10:07 UTC (54 KB)
[v2] Tue, 20 Oct 2015 03:01:53 UTC (102 KB)
[v3] Fri, 3 Jun 2016 23:11:21 UTC (96 KB)
[v4] Tue, 7 Jun 2016 04:02:43 UTC (96 KB)
