From complex to simple : hierarchical free-energy landscape renormalized in deep neural networks

Yoshino, Hajime

doi:10.21468/SciPostPhysCore.2.2.005

arXiv:1910.09918 (cond-mat)

[Submitted on 22 Oct 2019 (v1), last revised 10 Apr 2020 (this version, v4)]

We develop a statistical mechanical approach based on the replica method to study the design space of deep and wide neural networks constrained to meet a large number of training data. Specifically, we analyze the configuration space of the synaptic weights and neurons in the hidden layers in a simple feed-forward perceptron network for two scenarios: a setting with random inputs/outputs and a teacher-student setting. As the strength of the constraints is increased, i.e., as the number of training data grows, successive second-order glass transitions (random inputs/outputs) or second-order crystalline transitions (teacher-student setting) take place layer by layer, starting next to the input/output boundaries and proceeding deeper into the bulk, with the thickness of the solid phase growing logarithmically with the data size. This implies that the typical storage capacity of the network grows exponentially fast with the depth. In a deep enough network, the central part remains in the liquid phase. We argue that in systems of finite width $N$, a weak bias field can remain in the center and play the role of a symmetry-breaking field that connects the opposite sides of the system. The successive glass transitions bring about a hierarchical free-energy landscape with ultrametricity, which evolves in space: it is most complex close to the boundaries but becomes renormalized into progressively simpler landscapes in deeper layers. These observations provide clues to understand why deep neural networks operate efficiently. Finally, we present some numerical simulations of learning which reveal spatially heterogeneous glassy dynamics truncated by a finite width $N$ effect.
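
The step from logarithmic growth of the solid phase to an exponentially growing capacity can be made explicit with a rough sketch. The symbols here are assumptions introduced only for illustration and are not taken from the paper: $L$ is the depth of the network, $M$ the number of training data, $\xi(M)$ the thickness of the solid region next to each boundary, $a$ an unspecified positive constant, and $M^{*}$ the data size at which the solid regions growing from the two boundaries would meet in the center.

```latex
% Assumption: the solid-phase thickness grows logarithmically with the data size M
\xi(M) \simeq a \ln M , \qquad a > 0 .

% Assumption: the liquid center disappears once the two solid regions meet
2\,\xi(M^{*}) \simeq L
\quad\Longrightarrow\quad
M^{*} \sim \exp\!\left(\frac{L}{2a}\right) .
```

Under these assumptions, the amount of data needed to constrain the whole network grows exponentially with $L$, which is one way to read the abstract's statement about the storage capacity.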

Comments:	61 pages, 20 figures, revised version, to appear in SciPost Phys Core
Subjects:	Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1910.09918 [cond-mat.dis-nn]
	(or arXiv:1910.09918v4 [cond-mat.dis-nn] for this version)
	https://doi.org/10.48550/arXiv.1910.09918
Journal reference:	SciPost Phys. Core 2, 005 (2020)
Related DOI:	https://doi.org/10.21468/SciPostPhysCore.2.2.005

Submission history

From: Hajime Yoshino
[v1] Tue, 22 Oct 2019 12:21:06 UTC (1,814 KB)
[v2] Mon, 10 Feb 2020 15:20:48 UTC (2,071 KB)
[v3] Thu, 19 Mar 2020 04:22:32 UTC (2,207 KB)
[v4] Fri, 10 Apr 2020 02:30:29 UTC (2,207 KB)
