Analytic Insights into Structure and Rank of Neural Network Hessian Maps

Singh, Sidak Pal; Bachmann, Gregor; Hofmann, Thomas

arXiv:2106.16225 (cs)

[Submitted on 30 Jun 2021 (v1), last revised 1 Jul 2021 (this version, v2)]

Abstract: The Hessian of a neural network captures parameter interactions through second-order derivatives of the loss. It is a fundamental object of study, closely tied to various problems in deep learning, including model design, optimization, and generalization. Most prior work has been empirical, typically focusing on low-rank approximations and heuristics that are blind to the network structure. In contrast, we develop theoretical tools to analyze the range of the Hessian map, providing us with a precise understanding of its rank deficiency as well as the structural reasons behind it. This yields exact formulas and tight upper bounds for the Hessian rank of deep linear networks, allowing for an elegant interpretation in terms of rank deficiency. Moreover, we demonstrate that our bounds remain faithful as an estimate of the numerical Hessian rank for a larger class of models, such as rectified and hyperbolic tangent networks. Further, we investigate the implications of model architecture (e.g., width, depth, bias) on the rank deficiency. Overall, our work provides novel insights into the source and extent of redundancy in overparameterized networks.
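
To make "Hessian rank deficiency" concrete, the following is a minimal sketch (not the authors' code or their formulas) that computes the exact Hessian rank of a tiny two-layer linear network under squared-error loss. The sizes `d`, `h`, `k`, the data `X`, `Y`, the parameter values, and all helper names are illustrative assumptions. It relies on the fact that this loss has degree at most 2 in each individual weight, so central differences evaluated in rational arithmetic reproduce the second derivatives exactly and the rank can be found by exact Gaussian elimination.

```python
from fractions import Fraction


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def make_loss(X, Y, d, h, k):
    """Squared-error loss of x -> W2 W1 x as a function of a flat parameter list."""
    def loss(theta):
        W1 = [theta[i * d:(i + 1) * d] for i in range(h)]                  # h x d
        W2 = [theta[h * d + i * h:h * d + (i + 1) * h] for i in range(k)]  # k x h
        P = matmul(W2, matmul(W1, X))                                      # k x N
        return sum((p - y) ** 2 for pr, yr in zip(P, Y) for p, y in zip(pr, yr)) / 2
    return loss


def hessian(f, theta, step=Fraction(1)):
    """Central-difference Hessian; exact here because f is at most
    quadratic in each single parameter and all arithmetic is rational."""
    n = len(theta)

    def shifted(i, si, j, sj):
        t = list(theta)
        t[i] += si * step
        t[j] += sj * step
        return f(t)

    f0 = f(theta)
    H = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        H[i][i] = (shifted(i, 1, i, 0) - 2 * f0 + shifted(i, -1, i, 0)) / step ** 2
        for j in range(i + 1, n):
            mixed = (shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
                     - shifted(i, -1, j, 1) + shifted(i, -1, j, -1))
            H[i][j] = H[j][i] = mixed / (4 * step ** 2)
    return H


def rank(M):
    """Exact matrix rank via Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r


# Illustrative sizes: 3 inputs, 2 hidden units, 1 output, 4 samples.
d, h, k = 3, 2, 1
X = [[1, 0, 1, 2], [0, 1, 1, 0], [1, 1, 0, 1]]   # d x N inputs (columns are samples)
Y = [[1, 0, 2, 1]]                               # k x N targets
# Flat parameters: rows of W1 (h x d) followed by rows of W2 (k x h).
theta = [Fraction(v) for v in (1, 2, 0, 0, 1, -1, 1, -1)]

p = len(theta)
H = hessian(make_loss(X, Y, d, h, k), theta)
hessian_rank = rank(H)
print(f"parameters: {p}, exact Hessian rank: {hessian_rank}")
```

For this particular shape the rank must fall below the parameter count of 8 by a simple subadditivity argument: the outer-product (Gauss-Newton) part of the Hessian has rank at most d = 3 because the single output depends on the weights only through a 3-dimensional effective vector, and the residual-weighted part has rank at most 2h = 4. The general formulas and bounds referred to in the abstract are given in the paper itself and are not reproduced here.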

Subjects:	Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Statistics Theory (math.ST); Machine Learning (stat.ML)
Cite as:	arXiv:2106.16225 [cs.LG]
	(or arXiv:2106.16225v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2106.16225

Submission history

From: Sidak Pal Singh
[v1] Wed, 30 Jun 2021 17:29:58 UTC (36,281 KB)
[v2] Thu, 1 Jul 2021 17:57:50 UTC (36,266 KB)
