Training Dynamics of Deep Network Linear Regions

Humayun, Ahmed Imtiaz; Balestriero, Randall; Baraniuk, Richard

arXiv:2310.12977 (cs)

[Submitted on 19 Oct 2023]

Abstract: The study of Deep Network (DN) training dynamics has largely focused on the evolution of the loss function, evaluated on or around train and test set data points. In fact, many DN phenomena, e.g., double descent and grokking, were first introduced in the literature in those terms. In this study, we look at the training dynamics of the input space partition or linear regions formed by continuous piecewise affine DNs, e.g., networks with (leaky) ReLU nonlinearities. First, we present a novel statistic that encompasses the local complexity (LC) of the DN based on the concentration of linear regions inside arbitrary dimensional neighborhoods around data points. We observe that during training, the LC around data points undergoes a number of phases, starting with a decreasing trend after initialization, followed by an ascent and ending with a final descending trend. Using exact visualization methods, we come across the perplexing observation that during the final LC descent phase of training, linear regions migrate away from training and test samples towards the decision boundary, making the DN input-output mapping nearly linear everywhere else. We also observe that the different LC phases are closely related to the memorization and generalization performance of the DN, especially during grokking.
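
To make the idea of "concentration of linear regions around a data point" concrete, here is a minimal sketch. It is not the LC statistic defined in the paper; it is a simple sampling proxy, assuming that two inputs lie in the same linear region of a ReLU network when they share the same on/off activation pattern. The helper names (`activation_pattern`, `local_region_count`, `random_relu_mlp`), the box-shaped neighborhood, and the toy random network are all illustrative assumptions.

```python
import numpy as np

def activation_pattern(x, weights, biases):
    """Return the on/off state of every hidden ReLU unit for input x."""
    pattern = []
    h = x
    for W, b in zip(weights, biases):
        pre = W @ h + b
        pattern.append(pre > 0)      # which units are active in this layer
        h = np.maximum(pre, 0.0)     # ReLU
    return np.concatenate(pattern)

def local_region_count(x, weights, biases, radius=0.1, n_samples=512, seed=0):
    """Count distinct activation patterns among points sampled in a box around x.

    Assumption: this sampled count is only a rough proxy for the number of
    linear regions in the neighborhood, not the paper's LC estimator.
    """
    rng = np.random.default_rng(seed)
    offsets = rng.uniform(-radius, radius, size=(n_samples, x.shape[0]))
    patterns = {activation_pattern(x + d, weights, biases).tobytes() for d in offsets}
    return len(patterns)

def random_relu_mlp(sizes, seed=0):
    """Build random weights and biases for a toy ReLU network (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    weights = [rng.normal(size=(o, i)) / np.sqrt(i) for i, o in zip(sizes[:-1], sizes[1:])]
    biases = [0.1 * rng.normal(size=o) for o in sizes[1:]]
    return weights, biases

# Toy network with a 2-D input and two hidden layers of 16 units each.
weights, biases = random_relu_mlp([2, 16, 16])
x = np.zeros(2)

small = local_region_count(x, weights, biases, radius=0.01)
large = local_region_count(x, weights, biases, radius=1.0)
print("patterns within radius 0.01:", small)
print("patterns within radius 1.0: ", large)
```

Tracking such a count around training points at successive checkpoints would give a rough picture of how local complexity changes during training, although a sampled count can miss small regions and depends on the chosen radius and number of samples.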

Comments:	14 pages, 14 figures
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2310.12977 [cs.LG]
	(or arXiv:2310.12977v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2310.12977

Submission history

From: Ahmed Imtiaz Humayun
[v1] Thu, 19 Oct 2023 17:59:44 UTC (26,185 KB)
