An Information-Theoretic Framework for Supervised Learning

Jeon, Hong Jun; Zhu, Yifan; Van Roy, Benjamin

arXiv:2203.00246 (cs)

[Submitted on 1 Mar 2022 (v1), last revised 24 Mar 2023 (this version, v6)]

Abstract: Each year, deep learning demonstrates new and improved empirical results with deeper and wider neural networks. Meanwhile, with existing theoretical frameworks, it is difficult to analyze networks deeper than two layers without resorting to counting parameters or encountering sample complexity bounds that are exponential in depth. It may therefore be fruitful to analyze modern machine learning through a different lens. In this paper, we propose a novel information-theoretic framework with its own notions of regret and sample complexity for analyzing the data requirements of machine learning. With our framework, we first work through classical examples such as scalar estimation and linear regression to build intuition and introduce general techniques. Then, we use the framework to study the sample complexity of learning from data generated by deep neural networks with ReLU activation units. For a particular prior distribution on weights, we establish sample complexity bounds that are simultaneously width-independent and linear in depth. This prior distribution gives rise to high-dimensional latent representations that, with high probability, admit reasonably accurate low-dimensional approximations. We conclude by corroborating our theoretical results with an experimental analysis of random single-hidden-layer neural networks.

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as: arXiv:2203.00246 [cs.LG] (or arXiv:2203.00246v6 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2203.00246

Submission history

From: Hong Jun Jeon
[v1] Tue, 1 Mar 2022 05:58:28 UTC (141 KB)
[v2] Fri, 4 Mar 2022 23:53:39 UTC (141 KB)
[v3] Thu, 7 Apr 2022 22:33:12 UTC (141 KB)
[v4] Sun, 22 May 2022 20:29:40 UTC (278 KB)
[v5] Tue, 7 Jun 2022 21:13:39 UTC (932 KB)
[v6] Fri, 24 Mar 2023 19:48:25 UTC (20,342 KB)
