Magnitude Invariant Parametrizations Improve Hypernetwork Learning

Ortiz, Jose Javier Gonzalez; Guttag, John; Dalca, Adrian

arXiv:2304.07645 (cs)

[Submitted on 15 Apr 2023 (v1), last revised 29 Jun 2023 (this version, v2)]

Abstract: Hypernetworks, neural networks that predict the parameters of another neural network, are powerful models that have been successfully used in diverse applications from image generation to multi-task learning. Unfortunately, existing hypernetworks are often challenging to train. Training typically converges far more slowly than for non-hypernetwork models, and the rate of convergence can be very sensitive to hyperparameter choices. In this work, we identify a fundamental and previously unidentified problem that contributes to the challenge of training hypernetworks: a magnitude proportionality between the inputs and outputs of the hypernetwork. We demonstrate both analytically and empirically that this can lead to unstable optimization, thereby slowing down convergence, and sometimes even preventing any learning. We present a simple solution to this problem using a revised hypernetwork formulation that we call Magnitude Invariant Parametrizations (MIP). We demonstrate the proposed solution on several hypernetwork tasks, where it consistently stabilizes training and achieves faster convergence. Furthermore, we perform a comprehensive ablation study including choices of activation function, normalization strategies, input dimensionality, and hypernetwork architecture, and find that MIP improves training in all scenarios. We provide easy-to-use code that can turn existing networks into MIP-based hypernetworks.
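
The magnitude proportionality mentioned in the abstract can be illustrated with a small sketch. This is not the authors' code: `toy_hypernet`, `constant_norm_encoding`, the layer sizes, and the random weights are all hypothetical choices made for illustration. The sketch assumes a bias-free two-layer ReLU network as the hypernetwork, for which scaling the input by a positive factor scales the predicted weights by the same factor. It also shows one possible way to give the input a constant norm (mapping a scalar in [0, 1] onto the unit circle); the abstract does not say whether MIP uses this particular encoding.

```python
import math
import random


def relu(v):
    return [max(0.0, x) for x in v]


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def make_weights(rows, cols, rng):
    # Random Gaussian weights; sizes are arbitrary illustration choices.
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def toy_hypernet(x, W1, W2):
    """Hypothetical bias-free two-layer ReLU MLP that predicts a weight vector."""
    return matvec(W2, relu(matvec(W1, x)))


def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))


def constant_norm_encoding(gamma):
    """Assumed encoding: map gamma in [0, 1] to a point on the unit circle."""
    angle = math.pi * gamma / 2
    return [math.cos(angle), math.sin(angle)]


rng = random.Random(0)
W1 = make_weights(16, 1, rng)
W2 = make_weights(8, 16, rng)

gammas = (0.1, 0.2, 0.4)

# Raw scalar input: the norm of the predicted weights grows with the input.
raw_norms = [l2_norm(toy_hypernet([g], W1, W2)) for g in gammas]

# Encoded input: the norm of what the hypernetwork sees no longer depends on gamma.
enc_norms = [l2_norm(constant_norm_encoding(g)) for g in gammas]

print("predicted-weight norms for raw inputs:", raw_norms)
print("input norms after encoding:", enc_norms)
```

In this toy setting, doubling the raw input doubles the norm of the predicted weights, while the encoded input keeps the same norm for every value of gamma.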

Comments:	Source code at this https URL
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2304.07645 [cs.LG]
	(or arXiv:2304.07645v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2304.07645

Submission history

From: Jose Javier Gonzalez Ortiz
[v1] Sat, 15 Apr 2023 22:18:29 UTC (1,354 KB)
[v2] Thu, 29 Jun 2023 16:38:42 UTC (2,274 KB)
