Glassy dynamics near the learnability transition in deep recurrent networks

Hertz, John; Tyrcha, Joanna

Abstract: We examine learning dynamics in deep recurrent networks, focusing on the behavior near the learnability transition. The training data are Bach chorales in 4-part harmony, and learning is by stochastic gradient descent. The negative log-likelihood exhibits power-law decay at long learning times, with an exponent that depends on depth d (the number of layers) and width w (the number of hidden units per layer). When the network is underparametrized (too small to learn the data), the power-law approach is to a positive asymptotic value. We find that, for a given depth, the learning time appears to diverge in proportion to 1/(w - w_c) as w approaches a critical value w_c from above. w_c is a decreasing function of the number of layers and the number of hidden units per layer. We also study aging dynamics (the slowing-down of fluctuations as the time since the beginning of learning grows). We consider a system that has been learning for a time tau_w and measure the fluctuations of the weight values in a time interval of length tau after tau_w. In the underparametrized phase, we find that the fluctuations are well described by a single function of tau/tau_w, independent of tau_w, consistent with the weak ergodicity breaking frequently seen in glassy systems. This scaling persists for short times in the overparametrized phase but breaks down at long times.
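The divergence of the learning time described above, T(w) proportional to 1/(w - w_c), implies that 1/T is linear in w and crosses zero at w = w_c. The following is a minimal sketch of how a critical width could be estimated from measured learning times under that assumption. The function name `estimate_critical_width`, the amplitude parameter, and all numerical values are hypothetical illustrations on synthetic data. They are not taken from the paper and are not the authors' fitting procedure.

```python
def estimate_critical_width(widths, learning_times):
    """Fit T(w) = A / (w - w_c) by least squares on 1/T versus w.

    Since 1/T = (w - w_c) / A, a straight-line fit 1/T = slope * w + intercept
    gives A = 1 / slope and w_c = -intercept / slope.
    Returns the pair (w_c, A).
    """
    n = len(widths)
    inv_t = [1.0 / t for t in learning_times]

    # Ordinary least-squares line through the points (w, 1/T).
    mean_w = sum(widths) / n
    mean_y = sum(inv_t) / n
    sxx = sum((w - mean_w) ** 2 for w in widths)
    sxy = sum((w - mean_w) * (y - mean_y) for w, y in zip(widths, inv_t))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_w

    w_c = -intercept / slope   # width at which 1/T extrapolates to zero
    amplitude = 1.0 / slope    # prefactor A in T = A / (w - w_c)
    return w_c, amplitude


# Synthetic, noise-free data generated from the assumed form (illustrative only).
true_w_c, true_amplitude = 30.0, 5000.0
widths = [40.0, 50.0, 60.0, 80.0, 100.0]
learning_times = [true_amplitude / (w - true_w_c) for w in widths]

w_c_fit, amplitude_fit = estimate_critical_width(widths, learning_times)
print(f"estimated w_c = {w_c_fit:.3f}, amplitude = {amplitude_fit:.1f}")
```

With real measurements the points would be noisy, and only widths close to w_c would be expected to follow the linear relation, so the fit range would need to be chosen with care.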

Comments:	12 pages, 9 figures
Subjects:	Disordered Systems and Neural Networks (cond-mat.dis-nn)
Cite as:	arXiv:2412.10094 [cond-mat.dis-nn]
	(or arXiv:2412.10094v1 [cond-mat.dis-nn] for this version)
	https://doi.org/10.48550/arXiv.2412.10094

