Rivalry of Two Families of Algorithms for Memory-Restricted Streaming PCA

Li, Chun-Liang; Lin, Hsuan-Tien; Lu, Chi-Jen


arXiv:1506.01490 (stat)

[Submitted on 4 Jun 2015 (v1), last revised 12 Oct 2015 (this version, v2)]


Abstract: We study the problem of recovering the subspace spanned by the first $k$ principal components of $d$-dimensional data in the streaming setting, with a memory bound of $O(kd)$. Two families of algorithms are known for this problem. The first family is based on the framework of stochastic gradient descent. However, its convergence rate can be seriously affected by the learning rate of the descent steps and deserves closer study. The second family is based on the power method over blocks of data, but setting the block size for its existing algorithms is not an easy task. In this paper, we analyze the convergence rate of a representative algorithm with a decayed learning rate (Oja and Karhunen, 1985) in the first family for the general $k>1$ case. Moreover, we propose a novel algorithm for the second family that sets the block sizes automatically and dynamically, with a faster convergence rate. We then conduct empirical studies that fairly compare the two families on real-world data. The studies reveal the advantages and disadvantages of these two families.
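
To make the two families concrete, the following is a minimal sketch of one generic member of each, not the algorithms analyzed or proposed in the paper. It assumes zero-mean samples, a decayed learning rate of the form c/t for the gradient-style update, and a fixed block size for the power-method update (the paper's proposed algorithm chooses block sizes dynamically, which this sketch does not attempt to reproduce). The function names, the constant `c`, and the parameter `block_size` are hypothetical and chosen only for illustration.

```python
import numpy as np


def oja_streaming_pca(stream, d, k, c=1.0, seed=0):
    """Oja-style stochastic gradient update with learning rate c / t.

    Only the d-by-k estimate Q and the current sample are held in memory,
    so the footprint is O(kd). Samples are assumed to be zero-mean.
    """
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((d, k)))  # random orthonormal start
    for t, x in enumerate(stream, start=1):
        eta = c / t                          # decayed learning rate (assumed form)
        Q = Q + eta * np.outer(x, x @ Q)     # rank-one step: Q + eta * x x^T Q
        Q, _ = np.linalg.qr(Q)               # re-orthonormalize the columns
    return Q


def block_power_streaming_pca(stream, d, k, block_size=100, seed=0):
    """Power method over blocks of data with a fixed block size.

    Accumulates S = sum of x x^T Q over a block without ever forming the
    d-by-d covariance matrix, then orthonormalizes S to obtain the next Q.
    A trailing partial block is discarded.
    """
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    S = np.zeros((d, k))
    n = 0
    for x in stream:
        S += np.outer(x, x @ Q)              # d-by-k accumulator, O(kd) memory
        n += 1
        if n == block_size:
            Q, _ = np.linalg.qr(S / n)       # one power-iteration step
            S = np.zeros((d, k))
            n = 0
    return Q
```

Both functions accept any iterable of length-d NumPy vectors, for example `oja_streaming_pca(iter(X), d, k)` for an array `X` of shape `(n, d)`, and return a d-by-k matrix with orthonormal columns. In this sketch the tuning burden mentioned in the abstract shows up as the choice of `c` in the first function and of `block_size` in the second.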

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:1506.01490 [stat.ML]
	(or arXiv:1506.01490v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1506.01490

Submission history

From: Chun-Liang Li
[v1] Thu, 4 Jun 2015 07:36:57 UTC (200 KB)
[v2] Mon, 12 Oct 2015 02:19:30 UTC (248 KB)
