Measuring and regularizing networks in function space

Benjamin, Ari S.; Rolnick, David; Kording, Konrad

arXiv:1805.08289 (cs)

[Submitted on 21 May 2018 (v1), last revised 26 Jun 2019 (this version, v3)]

Abstract: To optimize a neural network one often thinks of optimizing its parameters, but it is ultimately a matter of optimizing the function that maps inputs to outputs. Since a change in the parameters might serve as a poor proxy for the change in the function, it is of some concern that primacy is given to parameters while the correspondence between the two has not been tested. Here, we show that it is simple and computationally feasible to calculate distances between functions in an $L^2$ Hilbert space. We examine how typical networks behave in this space, and compare parameter $\ell^2$ distances with function $L^2$ distances between various points of an optimization trajectory. We find that the two distances are nontrivially related. In particular, the $L^2/\ell^2$ ratio decreases throughout optimization, reaching a steady value around the time test error plateaus. We then investigate how the $L^2$ distance could be applied directly to optimization. First, we propose that in multitask learning, one can avoid catastrophic forgetting by directly limiting how much the input/output function changes between tasks. Second, we propose a new learning rule that constrains the distance a network can travel through $L^2$-space in any one update. This allows new examples to be learned in a way that minimally interferes with what has previously been learned. These applications demonstrate how one can measure and regularize function distances directly, without relying on parameters or local approximations like loss curvature.
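The distinction between the two distances can be illustrated with a minimal sketch. It assumes the function-space distance is estimated by averaging squared output differences over a sample of inputs, which is one natural Monte Carlo reading of an $L^2$ distance and is not taken from the paper's code. The tiny network `mlp` and the helpers `init_params`, `function_l2_distance`, and `parameter_l2_distance` are hypothetical names introduced only for this example.

```python
import numpy as np


def mlp(params, x):
    """Tiny two-layer tanh network (illustrative only); params = (W1, b1, W2, b2)."""
    W1, b1, W2, b2 = params
    return np.tanh(x @ W1 + b1) @ W2 + b2


def init_params(rng, d_in=4, d_hidden=8, d_out=3):
    """Random parameters for the toy network above."""
    return (
        rng.normal(size=(d_in, d_hidden)),
        np.zeros(d_hidden),
        rng.normal(size=(d_hidden, d_out)),
        np.zeros(d_out),
    )


def function_l2_distance(f_params, g_params, inputs):
    """Monte Carlo estimate of ||f - g||_{L^2} over the empirical input sample.

    Assumption: the expectation over the input distribution is replaced by
    a mean over the rows of `inputs`.
    """
    diff = mlp(f_params, inputs) - mlp(g_params, inputs)
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))


def parameter_l2_distance(f_params, g_params):
    """Euclidean (ell^2) distance between the two flattened parameter sets."""
    total = sum(np.sum((a - b) ** 2) for a, b in zip(f_params, g_params))
    return float(np.sqrt(total))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f = init_params(rng)
    # Perturb every parameter slightly to mimic one optimization step.
    g = tuple(p + 0.01 * rng.normal(size=p.shape) for p in f)
    x = rng.normal(size=(1024, 4))  # stand-in for held-out inputs

    d_func = function_l2_distance(f, g, x)
    d_param = parameter_l2_distance(f, g)
    print("function L2 distance: ", d_func)
    print("parameter l2 distance:", d_param)
    print("L2 / l2 ratio:        ", d_func / d_param)
```

Tracking the printed ratio between successive checkpoints of a real training run would be one way to probe the relationship the abstract describes. The choice of input sample matters, since the estimate is only meaningful with respect to the distribution the inputs are drawn from.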

Comments:	Presented at ICLR 2019
Subjects:	Neural and Evolutionary Computing (cs.NE); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1805.08289 [cs.NE]
	(or arXiv:1805.08289v3 [cs.NE] for this version)
	https://doi.org/10.48550/arXiv.1805.08289
Journal reference:	International Conference on Learning Representations, 2019, https://openreview.net/pdf?id=SkMwpiR9Y7

Submission history

From: Ari Benjamin
[v1] Mon, 21 May 2018 21:03:21 UTC (6,405 KB)
[v2] Mon, 3 Dec 2018 22:17:51 UTC (8,453 KB)
[v3] Wed, 26 Jun 2019 19:04:34 UTC (8,454 KB)
