Robust Testing in High-Dimensional Sparse Models

George, Anand Jerry; Canonne, Clément L.

Computer Science > Information Theory

arXiv:2205.07488 (cs)

[Submitted on 16 May 2022 (v1), last revised 4 Nov 2022 (this version, v2)]

Abstract: We consider the problem of robustly testing the norm of a high-dimensional sparse signal vector under two different observation models. In the first model, we are given $n$ i.i.d. samples from the distribution $\mathcal{N}\left(\theta,I_d\right)$ (with unknown $\theta$), of which a small fraction has been arbitrarily corrupted. Under the promise that $\|\theta\|_0\le s$, we want to correctly distinguish whether $\|\theta\|_2=0$ or $\|\theta\|_2>\gamma$, for some input parameter $\gamma>0$. We show that any algorithm for this task requires $n=\Omega\left(s\log\frac{ed}{s}\right)$ samples, which is tight up to logarithmic factors. We also extend our results to other common notions of sparsity, namely, $\|\theta\|_q\le s$ for any $0 < q < 2$. In the second observation model, the data is generated according to a sparse linear regression model, where the covariates are i.i.d. Gaussian and the regression coefficient (signal) is known to be $s$-sparse. Here, too, we assume that an $\epsilon$-fraction of the data is arbitrarily corrupted. We show that any algorithm that reliably tests the norm of the regression coefficient requires at least $n=\Omega\left(\min(s\log d, 1/\gamma^4)\right)$ samples. Our results show that the complexity of testing in these two settings increases significantly under robustness constraints. This is in line with recent observations made in robust mean testing and robust covariance testing.
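For readers who want to experiment with the two observation models described in the abstract, the following is a minimal simulation sketch, not the authors' construction. All function names, the `outlier` value, and the fixed (non-adaptive) corruption rule are hypothetical choices made for illustration. The abstract allows arbitrary corruption and does not specify the regression noise, so unit-variance Gaussian noise is assumed here.

```python
import math
import random


def sparse_signal(d, s, norm, rng):
    """Return a vector in R^d with at most s nonzero entries and l2 norm `norm`.

    norm = 0 gives the all-zero vector (the null hypothesis).
    """
    theta = [0.0] * d
    if norm > 0:
        # Spread the norm evenly over a random support of size s.
        for j in rng.sample(range(d), s):
            theta[j] = norm / math.sqrt(s)
    return theta


def gaussian_mean_model(theta, n, eps, rng, outlier=10.0):
    """Draw n samples from N(theta, I_d), then overwrite int(eps * n) of them.

    Assumption: a simple fixed-value adversary stands in for arbitrary corruption.
    """
    samples = [[t + rng.gauss(0.0, 1.0) for t in theta] for _ in range(n)]
    for i in rng.sample(range(n), int(eps * n)):
        samples[i] = [outlier] * len(theta)
    return samples


def sparse_regression_model(theta, n, eps, rng, outlier=10.0):
    """Draw (x, y) pairs with x ~ N(0, I_d) and y = <x, theta> + noise.

    Assumptions: noise is N(0, 1), and the adversary only replaces responses.
    """
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in theta]
        y = sum(a * b for a, b in zip(x, theta)) + rng.gauss(0.0, 1.0)
        data.append((x, y))
    for i in rng.sample(range(n), int(eps * n)):
        x, _ = data[i]
        data[i] = (x, outlier)
    return data
```

Calling `sparse_signal(d, s, 0.0, rng)` produces data under the null $\|\theta\|_2=0$, while any `norm` strictly greater than $\gamma$ produces data under the alternative. A stronger adversary than the fixed-value one used here would be needed to probe the lower bounds stated in the abstract.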

Comments:	Fixed typos, added a figure and discussion section
Subjects:	Information Theory (cs.IT); Machine Learning (cs.LG); Statistics Theory (math.ST); Machine Learning (stat.ML)
Cite as:	arXiv:2205.07488 [cs.IT]
	(or arXiv:2205.07488v2 [cs.IT] for this version)
	https://doi.org/10.48550/arXiv.2205.07488

Submission history

From: Anand Jerry George
[v1] Mon, 16 May 2022 07:47:22 UTC (38 KB)
[v2] Fri, 4 Nov 2022 22:18:50 UTC (63 KB)
