Almost Linear Constant-Factor Sketching for $\ell_1$ and Logistic Regression

Munteanu, Alexander; Omlor, Simon; Woodruff, David

arXiv:2304.00051 (cs)

[Submitted on 31 Mar 2023]

Abstract: We improve upon previous oblivious sketching and turnstile streaming results for $\ell_1$ and logistic regression, giving a much smaller sketching dimension achieving $O(1)$-approximation and yielding an efficient optimization problem in the sketch space. Namely, we achieve for any constant $c>0$ a sketching dimension of $\tilde{O}(d^{1+c})$ for $\ell_1$ regression and $\tilde{O}(\mu d^{1+c})$ for logistic regression, where $\mu$ is a standard measure that captures the complexity of compressing the data. For $\ell_1$ regression our sketching dimension is near-linear and improves on previous work, which either required $\Omega(\log d)$-approximation with this sketching dimension, or required a larger $\operatorname{poly}(d)$ number of rows. Similarly, for logistic regression previous work had worse $\operatorname{poly}(\mu d)$ factors in its sketching dimension. We also give a tradeoff that yields a $1+\varepsilon$ approximation in input sparsity time by increasing the total size to $(d\log(n)/\varepsilon)^{O(1/\varepsilon)}$ for $\ell_1$ and to $(\mu d\log(n)/\varepsilon)^{O(1/\varepsilon)}$ for logistic regression. Finally, we show that our sketch can be extended to approximate a regularized version of logistic regression where the data-dependent regularizer corresponds to the variance of the individual logistic losses.
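To make the terms "oblivious sketching" and "turnstile streaming" concrete, the following is a minimal illustrative sketch, not the construction from the paper. It assumes a plain CountSketch-style map (one random bucket and one random sign per data row) as a stand-in for an oblivious linear sketch, and it carries no approximation guarantee for $\ell_1$ or logistic regression. All function names and the toy update stream are hypothetical. The only point shown is linearity: because the map is drawn without looking at the data, maintaining the sketch under a stream of signed updates to the entries of $[A \mid b]$ gives the same result as sketching the final matrix in one pass.

```python
import random


def draw_sketch(n, rows, seed=0):
    """Draw a CountSketch-style map for n data rows (hypothetical helper).

    Each data row gets one target bucket and one random sign. The draw
    depends only on (n, rows, seed), never on the data: it is oblivious.
    """
    rng = random.Random(seed)
    bucket = [rng.randrange(rows) for _ in range(n)]
    sign = [rng.choice((-1, 1)) for _ in range(n)]
    return bucket, sign


def sketch_matrix(matrix, bucket, sign, rows):
    """Apply the sketch to a fully materialized n x cols matrix."""
    cols = len(matrix[0])
    out = [[0] * cols for _ in range(rows)]
    for i, data_row in enumerate(matrix):
        for j, value in enumerate(data_row):
            out[bucket[i]][j] += sign[i] * value
    return out


def sketch_stream(updates, cols, bucket, sign, rows):
    """Maintain the sketch under turnstile updates (i, j, delta).

    Each update adds delta (possibly negative) to entry (i, j) of the
    underlying matrix; the matrix itself is never stored.
    """
    out = [[0] * cols for _ in range(rows)]
    for i, j, delta in updates:
        out[bucket[i]][j] += sign[i] * delta
    return out


# Toy instance: n = 6 data rows, d = 2 features plus a label column b.
n, cols, rows = 6, 3, 4
bucket, sign = draw_sketch(n, rows, seed=1)

# Turnstile stream with insertions and deletions (made-up numbers).
updates = [(0, 0, 5), (3, 2, -2), (0, 0, -5), (5, 1, 4), (3, 2, 1), (2, 0, 7)]

# Materialize the final matrix only to check the streamed sketch against it.
final = [[0] * cols for _ in range(n)]
for i, j, delta in updates:
    final[i][j] += delta

streamed = sketch_stream(updates, cols, bucket, sign, rows)
direct = sketch_matrix(final, bucket, sign, rows)
assert streamed == direct  # linearity: stream order and deletions do not matter
```

In a regression setting one would then solve the optimization problem on the small sketched matrix instead of the original data; how large `rows` must be for a constant-factor guarantee is exactly what the abstract's bounds quantify.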

Comments:	ICLR 2023
Subjects:	Data Structures and Algorithms (cs.DS); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2304.00051 [cs.DS]
	(or arXiv:2304.00051v1 [cs.DS] for this version)
	https://doi.org/10.48550/arXiv.2304.00051

Submission history

From: Simon Omlor
[v1] Fri, 31 Mar 2023 18:12:33 UTC (3,224 KB)
