Separation of Scales and a Thermodynamic Description of Feature Learning in Some CNNs

Seroussi, Inbar; Naveh, Gadi; Ringel, Zohar

arXiv:2112.15383v2 (stat)

[Submitted on 31 Dec 2021 (v1), revised 23 Feb 2022 (this version, v2), latest version 22 Sep 2022 (v3)]

Abstract: Deep neural networks (DNNs) are powerful tools for compressing and distilling information. Their scale and complexity, often involving billions of inter-dependent internal degrees of freedom, render direct microscopic analysis difficult. Under such circumstances, a common strategy is to identify slow degrees of freedom that average out the erratic behavior of the underlying fast microscopic variables. Here, we identify such a separation of scales occurring in fully trained over-parameterized deep convolutional neural networks (CNNs). Specifically, we show that DNN layers couple only through the second moments (kernels) of their activations and pre-activations. Moreover, in various settings, the latter fluctuate in a nearly Gaussian manner. For CNNs with infinitely many channels, these kernels are inert, while for finite CNNs they adapt to the data. In several deep non-linear CNN models trained on real data, the resulting thermodynamic theory of deep learning yields accurate predictions. In addition, it provides new ways of analyzing and understanding CNNs, and DNNs in general.

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Data Analysis, Statistics and Probability (physics.data-an)
Cite as:	arXiv:2112.15383 [stat.ML]
	(or arXiv:2112.15383v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2112.15383

Submission history

From: Inbar Seroussi
[v1] Fri, 31 Dec 2021 10:49:55 UTC (1,080 KB)
[v2] Wed, 23 Feb 2022 21:31:46 UTC (12,145 KB)
[v3] Thu, 22 Sep 2022 21:42:18 UTC (7,515 KB)
