Overparameterized ReLU Neural Networks Learn the Simplest Models: Neural Isometry and Exact Recovery

Wang, Yifei; Hua, Yixuan; Candès, Emmanuel; Pilanci, Mert

arXiv:2209.15265 (cs)

[Submitted on 30 Sep 2022 (v1), last revised 17 Feb 2023 (this version, v3)]

Abstract: The practice of deep learning has shown that neural networks generalize remarkably well even with an extreme number of learned parameters. This appears to contradict traditional statistical wisdom, in which a trade-off between model complexity and fit to the data is essential. We aim to address this discrepancy by adopting a convex optimization and sparse recovery perspective. We consider the training and generalization properties of two-layer ReLU networks with standard weight decay regularization. Under certain regularity assumptions on the data, we show that ReLU networks with an arbitrary number of parameters learn only simple models that explain the data. This is analogous to the recovery of the sparsest linear model in compressed sensing. For ReLU networks and their variants with skip connections or normalization layers, we present isometry conditions that ensure the exact recovery of planted neurons. For randomly generated data, we show the existence of a phase transition in recovering planted neural network models, which is easy to describe: whenever the ratio between the number of samples and the dimension exceeds a numerical threshold, the recovery succeeds with high probability; otherwise, it fails with high probability. Surprisingly, ReLU networks learn simple and sparse models that generalize well even when the labels are noisy. The phase transition phenomenon is confirmed through numerical experiments.
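
For orientation, the setting named in the abstract can be written in a commonly used form. This is a sketch under assumptions, not the paper's exact statement: the symbols X (an n-by-d data matrix), y (labels), w_j and \alpha_j (first- and second-layer weights of neuron j), m (number of neurons), \beta (weight decay strength), and c (the numerical threshold) are notation introduced here, and the abstract gives neither the precise formulation nor the value of c.

```latex
% Two-layer ReLU network trained with weight decay (assumed standard form)
\min_{\{w_j,\alpha_j\}_{j=1}^{m}} \;
  \frac{1}{2}\Big\| \sum_{j=1}^{m} (X w_j)_+ \,\alpha_j - y \Big\|_2^2
  + \frac{\beta}{2} \sum_{j=1}^{m} \big( \|w_j\|_2^2 + \alpha_j^2 \big),
\qquad (z)_+ = \max(z, 0) \text{ elementwise.}

% Phase transition described in the abstract, for randomly generated X
\frac{n}{d} > c \;\Longrightarrow\; \text{recovery succeeds with high probability},
\qquad
\frac{n}{d} < c \;\Longrightarrow\; \text{recovery fails with high probability}.
```

In this notation, "an arbitrary number of parameters" corresponds to m being arbitrarily large, and a "simple" model is one in which only a few neurons have nonzero weights, which parallels the sparsest-solution analogy with compressed sensing drawn in the abstract.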

Subjects:	Machine Learning (cs.LG); Information Theory (cs.IT); Optimization and Control (math.OC); Machine Learning (stat.ML)
Cite as:	arXiv:2209.15265 [cs.LG]
	(or arXiv:2209.15265v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2209.15265

Submission history

From: Yifei Wang
[v1] Fri, 30 Sep 2022 06:47:15 UTC (14,486 KB)
[v2] Tue, 4 Oct 2022 05:00:43 UTC (7,233 KB)
[v3] Fri, 17 Feb 2023 19:50:50 UTC (7,432 KB)
