Sharp Representation Theorems for ReLU Networks with Precise Dependence on Depth

Bresler, Guy; Nagaraj, Dheeraj

arXiv:2006.04048v2 (stat)

[Submitted on 7 Jun 2020 (v1), last revised 21 Feb 2021 (this version, v2)]

Abstract: We prove sharp dimension-free representation results for neural networks with $D$ ReLU layers under square loss for a class of functions $\mathcal{G}_D$ defined in the paper. These results capture the precise benefits of depth in the following sense:
1. The rates for representing the class of functions $\mathcal{G}_D$ via $D$ ReLU layers are sharp up to constants, as shown by matching lower bounds.
2. For each $D$, $\mathcal{G}_{D} \subseteq \mathcal{G}_{D+1}$ and as $D$ grows the class of functions $\mathcal{G}_{D}$ contains progressively less smooth functions.
3. If $D^{\prime} < D$, then the approximation rate for the class $\mathcal{G}_D$ achieved by depth $D^{\prime}$ networks is strictly worse than that achieved by depth $D$ networks.
This constitutes a fine-grained characterization of the representation power of feedforward networks of arbitrary depth $D$ and number of neurons $N$, in contrast to existing representation results, which either require $D$ to grow quickly with $N$ or assume that the function being represented is highly smooth. In the latter case, similar rates can be obtained with a single nonlinear layer. Our results confirm the prevailing hypothesis that deeper networks are better at representing less smooth functions. Indeed, the main technical novelty is to fully exploit the fact that deep networks can produce highly oscillatory functions with few activation functions.
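
The closing remark about oscillatory functions can be illustrated with a small sketch. This is the standard tent-map composition, not the construction used in the paper, and the names `tent`, `deep_sawtooth`, and `count_monotone_pieces` are hypothetical helpers introduced here. It assumes a one-dimensional input on $[0, 1]$ and shows that composing a two-ReLU tent map `depth` times uses only `2 * depth` ReLU units yet yields a sawtooth with $2^{\text{depth}}$ monotone pieces.

```python
def relu(x):
    """ReLU activation on a scalar."""
    return max(0.0, x)


def tent(x):
    """Tent map on [0, 1] built from two ReLU units (illustrative choice).

    Equals 2x on [0, 0.5] and 2 - 2x on [0.5, 1].
    """
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)


def deep_sawtooth(x, depth):
    """Compose the tent map `depth` times: 2 * depth ReLU units in total."""
    for _ in range(depth):
        x = tent(x)
    return x


def count_monotone_pieces(depth, grid=2 ** 12):
    """Count monotone pieces of the sawtooth on [0, 1].

    Uses a dyadic grid so that every breakpoint lies on a grid point
    (valid while 2 ** depth <= grid) and floating-point arithmetic is exact.
    """
    ys = [deep_sawtooth(i / grid, depth) for i in range(grid + 1)]
    diffs = [b - a for a, b in zip(ys, ys[1:])]
    # Each change in the sign of the slope starts a new monotone piece.
    changes = sum(1 for d0, d1 in zip(diffs, diffs[1:]) if (d0 > 0) != (d1 > 0))
    return changes + 1


if __name__ == "__main__":
    for depth in range(1, 7):
        print(depth, 2 * depth, count_monotone_pieces(depth))
```

Each extra layer adds two ReLU units but doubles the number of oscillations, which is the kind of depth-driven gain in non-smoothness the abstract refers to; a single hidden layer with the same number of units can only produce a number of linear pieces that grows linearly in the unit count.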

Comments:	12 pages, 1 figure
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:2006.04048 [stat.ML]
	(or arXiv:2006.04048v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2006.04048

Submission history

From: Dheeraj Nagaraj
[v1] Sun, 7 Jun 2020 05:25:06 UTC (631 KB)
[v2] Sun, 21 Feb 2021 21:51:01 UTC (632 KB)
