Towards Deepening Graph Neural Networks: A GNTK-based Optimization Perspective

Huang, Wei; Li, Yayong; Du, Weitao; Da Xu, Richard Yi; Yin, Jie; Chen, Ling; Zhang, Miao


arXiv:2103.03113v2 (cs)

[Submitted on 3 Mar 2021 (v1), revised 6 Oct 2021 (this version, v2), latest version 21 Apr 2022 (v3)]


Authors: Wei Huang, Yayong Li, Weitao Du, Richard Yi Da Xu, Jie Yin, Ling Chen, Miao Zhang


Abstract: Graph convolutional networks (GCNs) and their variants have achieved great success in dealing with graph-structured data. However, it is well known that deep GCNs suffer from the over-smoothing problem, where node representations tend to become indistinguishable as more layers are stacked. Theoretical research on deep GCNs to date has focused primarily on expressive power rather than on trainability, which is an optimization perspective. Compared to expressivity, trainability addresses a more fundamental question: given a sufficiently expressive space of models, can we successfully find a good solution with a gradient descent-based optimizer? This work fills this gap by exploiting the Graph Neural Tangent Kernel (GNTK), which governs the optimization trajectory under gradient descent for wide GCNs. We characterize the asymptotic behavior of the GNTK in the large-depth limit, which reveals that the trainability of wide and deep GCNs drops at an exponential rate in the optimization process. Additionally, we extend our theoretical framework to analyze techniques resembling residual connections, which are found to only mildly mitigate the exponential decay of trainability. To overcome the exponential decay problem more fundamentally, we propose Critical DropEdge, a connectivity-aware and graph-adaptive sampling method inspired by our theoretical insights on trainability. Experimental evaluation consistently confirms that the proposed method achieves better results than relevant counterparts in both the infinite-width and finite-width settings.
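The over-smoothing effect mentioned in the abstract can be illustrated with a small toy sketch. This is not the paper's GNTK analysis and not Critical DropEdge; it assumes a purely linear propagation X <- D^{-1}(A + I) X with no learned weights or nonlinearity, and the graph, features, and all function names are hypothetical choices made for illustration only.

```python
def row_normalized_adjacency(edges, num_nodes):
    """Build D^{-1}(A + I) as a dense list of lists for an undirected graph."""
    adj = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        adj[i][i] = 1.0  # self-loop, as in the usual GCN-style propagation
    for u, v in edges:
        adj[u][v] = 1.0
        adj[v][u] = 1.0
    # Divide each row by its sum so that every row is a probability vector.
    return [[a / sum(row) for a in row] for row in adj]


def propagate(p, x):
    """One linear propagation step X <- P X (assumption: no weights, no ReLU)."""
    return [
        [sum(p[i][k] * x[k][j] for k in range(len(x))) for j in range(len(x[0]))]
        for i in range(len(p))
    ]


def feature_spread(x):
    """Largest per-coordinate gap between any two nodes' feature vectors."""
    return max(max(col) - min(col) for col in zip(*x))


def spread_by_depth(edges, features, depth):
    """Record the feature spread after 0, 1, ..., depth propagation steps."""
    p = row_normalized_adjacency(edges, len(features))
    x = features
    spreads = [feature_spread(x)]
    for _ in range(depth):
        x = propagate(p, x)
        spreads.append(feature_spread(x))
    return spreads


# Hypothetical toy input: a 5-node path graph with 2-dimensional node features.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4)]
FEATURES = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.0, 0.0], [1.0, 1.0]]

SPREADS = spread_by_depth(EDGES, FEATURES, depth=32)
for layer in (0, 2, 8, 32):
    print(f"depth {layer:2d}: spread = {SPREADS[layer]:.6f}")
```

Because each row of the propagation matrix is a convex combination, the spread can never grow from one step to the next, and on a connected graph it shrinks towards zero, so deeper stacks leave the nodes with nearly identical features. Trained GCNs add weights and nonlinearities, so this sketch only conveys the intuition behind the abstract's claim, not its rate.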

Comments:	24 pages
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2103.03113 [cs.LG]
	(or arXiv:2103.03113v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2103.03113

Submission history

From: Wei Huang
[v1] Wed, 3 Mar 2021 11:06:12 UTC (4,250 KB)
[v2] Wed, 6 Oct 2021 06:53:41 UTC (7,857 KB)
[v3] Thu, 21 Apr 2022 11:10:33 UTC (3,542 KB)
