Random Matrix Theory for Stochastic Gradient Descent

Park, Chanju; Favoni, Matteo; Lucini, Biagio; Aarts, Gert

High Energy Physics - Lattice

arXiv:2412.20496 (hep-lat)

[Submitted on 29 Dec 2024]

Abstract: Investigating the dynamics of learning in machine learning algorithms is of paramount importance for understanding how and why an approach may be successful. The tools of physics and statistics provide a robust setting for such investigations. Here we apply concepts from random matrix theory to describe stochastic weight matrix dynamics, using the framework of Dyson Brownian motion. We derive the linear scaling rule between the learning rate (step size) and the batch size, and identify universal and non-universal aspects of weight matrix dynamics. We test our findings in the (near-)solvable case of the Gaussian Restricted Boltzmann Machine and in a linear one-hidden-layer neural network.

Comments:	13 pages, 9 figures, Proceedings of the 41st International Symposium on Lattice Field Theory (Lattice 2024), July 28th - August 3rd, 2024, University of Liverpool, UK
Subjects:	High Energy Physics - Lattice (hep-lat); Disordered Systems and Neural Networks (cond-mat.dis-nn); Machine Learning (cs.LG)
Cite as:	arXiv:2412.20496 [hep-lat]
	(or arXiv:2412.20496v1 [hep-lat] for this version)
	https://doi.org/10.48550/arXiv.2412.20496

Submission history

From: Chanju Park
[v1] Sun, 29 Dec 2024 15:21:13 UTC (2,484 KB)

