Principles for Initialization and Architecture Selection in Graph Neural Networks with ReLU Activations

DeZoort, Gage; Hanin, Boris

arXiv:2306.11668 (stat)

[Submitted on 20 Jun 2023]

Abstract: This article derives and validates three principles for initialization and architecture selection in finite-width graph neural networks (GNNs) with ReLU activations. First, we theoretically derive what is essentially the unique generalization to ReLU GNNs of the well-known He initialization. Our initialization scheme guarantees that the average scale of network outputs and gradients remains order one at initialization. Second, we prove that in finite-width vanilla ReLU GNNs, oversmoothing is unavoidable at large depth when using a fixed aggregation operator, regardless of initialization. We then prove that using residual aggregation operators, obtained by interpolating a fixed aggregation operator with the identity, alleviates oversmoothing at initialization. Finally, we show that the common practice of using residual connections with a Fixup-type initialization provably avoids correlation collapse in final-layer features at initialization. Through ablation studies, we find that using the correct initialization, residual aggregation operators, and residual connections in the forward pass significantly and reliably speeds up early training dynamics in deep ReLU GNNs on a variety of tasks.
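
The idea of a residual aggregation operator, described in the abstract as an interpolation between a fixed aggregation operator and the identity, can be illustrated with a small toy sketch. The code below is not the paper's construction: it drops the random weights and ReLU activations entirely and only repeats the aggregation step on a 4-node path graph. The mixing weight `alpha`, the parametrization `(1 - alpha) * I + alpha * A`, the graph, and all function names are assumptions made for illustration. It compares how quickly the spread between node features shrinks under a fixed row-normalized operator and under its residual counterpart.

```python
# Toy sketch (assumptions throughout): linear propagation only, no weights, no ReLU.

def row_normalize(matrix):
    """Divide each row by its sum so that aggregation averages over neighbors."""
    return [[entry / sum(row) for entry in row] for row in matrix]

def residual_operator(aggregation, alpha):
    """Interpolate with the identity: (1 - alpha) * I + alpha * A.

    `alpha` is a hypothetical mixing weight, not a value taken from the paper.
    """
    n = len(aggregation)
    return [
        [(1 - alpha) * (1.0 if i == j else 0.0) + alpha * aggregation[i][j]
         for j in range(n)]
        for i in range(n)
    ]

def propagate(operator, features, depth):
    """Apply the aggregation operator `depth` times to a scalar feature per node."""
    for _ in range(depth):
        features = [sum(w * f for w, f in zip(row, features)) for row in operator]
    return features

def spread(features):
    """Gap between the largest and smallest node feature; near 0 means oversmoothed."""
    return max(features) - min(features)

# Path graph on 4 nodes with self-loops (an arbitrary example graph).
adjacency = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
fixed_op = row_normalize(adjacency)
residual_op = residual_operator(fixed_op, alpha=0.1)

initial_features = [1.0, 0.0, 0.0, -1.0]
depth = 20

spread_fixed = spread(propagate(fixed_op, initial_features, depth))
spread_residual = spread(propagate(residual_op, initial_features, depth))

print(f"spread after {depth} layers, fixed operator:    {spread_fixed:.4f}")
print(f"spread after {depth} layers, residual operator: {spread_residual:.4f}")
```

In this toy setting the fixed operator drives the node features almost to a common value by depth 20, while the residual operator keeps them clearly separated at the same depth. The sketch says nothing about the finite-width ReLU setting analyzed in the paper, and even the residual operator smooths the features here if the depth is taken much larger.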

Comments:	Comments appreciated
Subjects:	Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); High Energy Physics - Experiment (hep-ex); Probability (math.PR)
Cite as:	arXiv:2306.11668 [stat.ML]
	(or arXiv:2306.11668v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2306.11668

Submission history

From: Boris Hanin
[v1] Tue, 20 Jun 2023 16:40:41 UTC (2,593 KB)
