Finite Sample Identification of Wide Shallow Neural Networks with Biases

Fornasier, Massimo; Klock, Timo; Mondelli, Marco; Rauchensteiner, Michael

arXiv:2211.04589 (cs)

[Submitted on 8 Nov 2022]

Abstract: Artificial neural networks are functions depending on a finite number of parameters, typically encoded as weights and biases. The identification of the parameters of the network from finite samples of input-output pairs is often referred to as the \emph{teacher-student model}, and this model has represented a popular framework for understanding training and generalization. Even if the problem is NP-complete in the worst case, a rapidly growing literature -- after adding suitable distributional assumptions -- has established finite sample identification of two-layer networks with a number of neurons $m=\mathcal O(D)$, $D$ being the input dimension. For the range $D<m<D^2$ the problem becomes harder, and truly little is known for networks parametrized by biases as well. This paper fills the gap by providing constructive methods and theoretical guarantees of finite sample identification for such wider shallow networks with biases. Our approach is based on a two-step pipeline: first, we recover the direction of the weights by exploiting second-order information; next, we identify the signs by suitable algebraic evaluations, and we recover the biases by empirical risk minimization via gradient descent. Numerical results demonstrate the effectiveness of our approach.
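
To make the first step of the pipeline more concrete, here is a minimal sketch of why second-order information can reveal weight directions. It is not the authors' algorithm. It assumes a teacher of the form $f(x)=\sum_{j=1}^m g(w_j^\top x+\tau_j)$ with unit-norm weights $w_j$ and $g=\tanh$, so that $\nabla^2 f(x)=\sum_j g''(w_j^\top x+\tau_j)\, w_j w_j^\top$ and every Hessian lies in the span of the rank-one matrices $w_j w_j^\top$. The activation, the sizes `D` and `m`, the finite-difference step, and all function names are illustrative assumptions. The sketch only checks that the subspace spanned by sampled Hessians contains each $w_j w_j^\top$; it does not extract the individual directions, the signs, or the biases.

```python
import numpy as np

def make_teacher(D, m, rng):
    """Random unit-norm weights and small biases (illustrative choice)."""
    W = rng.standard_normal((m, D))
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    tau = 0.5 * rng.standard_normal(m)
    return W, tau

def network(x, W, tau):
    """Shallow network f(x) = sum_j tanh(w_j . x + tau_j)."""
    return np.tanh(W @ x + tau).sum()

def fd_hessian(f, x, eps=1e-4):
    """Central finite-difference approximation of the Hessian of f at x."""
    D = x.size
    H = np.zeros((D, D))
    I = np.eye(D)
    for i in range(D):
        for j in range(i, D):
            ei, ej = eps * I[i], eps * I[j]
            val = (f(x + ei + ej) - f(x + ei - ej)
                   - f(x - ei + ej) + f(x - ei - ej)) / (4 * eps**2)
            H[i, j] = H[j, i] = val
    return H

def hessian_subspace(f, D, m, n_points, rng):
    """Orthonormal basis of the top-m right singular subspace of stacked, vectorized Hessians."""
    X = rng.standard_normal((n_points, D))
    M = np.stack([fd_hessian(f, x).ravel() for x in X])  # shape (n_points, D*D)
    _, sing, Vt = np.linalg.svd(M, full_matrices=False)
    return Vt[:m], sing

rng = np.random.default_rng(0)
D, m = 5, 8  # a width in the regime D < m < D^2
W, tau = make_teacher(D, m, rng)
basis, sing = hessian_subspace(lambda x: network(x, W, tau), D, m, 200, rng)

# Distance of each vectorized w_j w_j^T from the recovered subspace.
residuals = []
for w in W:
    v = np.outer(w, w).ravel()
    residuals.append(np.linalg.norm(v - basis.T @ (basis @ v)))
max_residual = max(residuals)
print("largest residual:", max_residual)
```

In this toy setting the residuals should be small, which indicates that the Hessian subspace captures the matrices $w_j w_j^\top$. Isolating the individual rank-one elements inside that subspace, and then recovering signs and biases, are the parts handled by the paper's method and are not attempted here.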

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
MSC classes:	65D15, 68T07, 90C26
Cite as:	arXiv:2211.04589 [cs.LG]
	(or arXiv:2211.04589v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2211.04589

Submission history

From: Michael Rauchensteiner
[v1] Tue, 8 Nov 2022 22:10:32 UTC (941 KB)
