Hierarchical Training of Deep Neural Networks Using Early Exiting

Sepehri, Yamin; Pad, Pedram; Yüzügüler, Ahmet Caner; Frossard, Pascal; Dunbar, L. Andrea

doi:10.1109/TNNLS.2024.3396628

arXiv:2303.02384 (cs)

[Submitted on 4 Mar 2023 (v1), last revised 20 May 2024 (this version, v4)]

Abstract:Deep neural networks provide state-of-the-art accuracy for vision tasks, but they require significant resources for training. They are therefore trained on cloud servers far from the edge devices that acquire the data, which increases communication cost, runtime, and privacy concerns. In this study, we propose a novel hierarchical training method for deep neural networks that uses early exits in an architecture divided between edge and cloud workers to reduce the communication cost, training runtime, and privacy concerns. The method introduces a new use case for early exits: separating the backward pass of the network between the edge and the cloud during the training phase. Because of the sequential nature of the training phase, most available methods either cannot train the levels of the hierarchy simultaneously or do so at the cost of compromising privacy. In contrast, our method can use both edge and cloud workers simultaneously, does not share the raw input data with the cloud, and does not require communication during the backward pass. Several simulations and on-device experiments for different neural network architectures demonstrate the effectiveness of this method. The proposed method reduces the training runtime for the VGG-16 and ResNet-18 architectures by 29% and 61%, respectively, in CIFAR-10 classification and by 25% and 81%, respectively, in Tiny ImageNet classification when communication with the cloud takes place over a low-bit-rate channel. This runtime gain is achieved with a negligible drop in accuracy. The method is advantageous for online learning of high-accuracy deep neural networks on sensor-holding, low-resource devices such as mobile phones or robots as part of an edge-cloud system, making them more flexible in facing new tasks and classes of data.
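
To make the idea of separating the backward pass concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes a scalar regression task, a one-parameter "edge" layer with a one-parameter early-exit head, and a one-parameter "cloud" layer; all names (`train_hierarchical`, `w_edge`, `v_exit`, `w_cloud`) are hypothetical. The edge parameters are updated only from the early-exit loss, and the cloud treats the received feature as a constant, so no gradient travels from the cloud back to the edge. Both sides are simulated sequentially in one process here, whereas the abstract describes them working simultaneously.

```python
import random


def train_hierarchical(steps=2000, lr=0.05, seed=0):
    """Toy simulation of edge/cloud training with an early exit.

    Assumption: the target is y = 3 * x, and each "layer" is one scalar weight.
    """
    rng = random.Random(seed)
    w_edge, v_exit, w_cloud = 0.5, 0.5, 0.5

    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        y = 3.0 * x

        # Edge forward pass: compute the feature and the early-exit prediction.
        h = w_edge * x
        exit_err = v_exit * h - y

        # Edge backward pass uses only the early-exit loss (exit_err ** 2).
        grad_v_exit = 2.0 * exit_err * h
        grad_w_edge = 2.0 * exit_err * v_exit * x

        # Cloud receives the feature h (never the raw input x) and treats it
        # as a constant, so its backward pass stops at w_cloud.
        cloud_err = w_cloud * h - y
        grad_w_cloud = 2.0 * cloud_err * h

        # Independent updates: no gradient is sent from cloud to edge.
        v_exit -= lr * grad_v_exit
        w_edge -= lr * grad_w_edge
        w_cloud -= lr * grad_w_cloud

    return w_edge, v_exit, w_cloud


if __name__ == "__main__":
    w_edge, v_exit, w_cloud = train_hierarchical()
    print("early-exit gain:", w_edge * v_exit)
    print("cloud-path gain:", w_edge * w_cloud)
```

In this sketch both the early-exit path and the cloud path learn the target mapping even though the edge never receives a gradient from the cloud, which is the property the abstract attributes to the early exit.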

Comments:	Accepted to IEEE Transactions on Neural Networks and Learning Systems (2024), 15 pages, 10 figures, 3 tables
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
ACM classes:	I.2.6; I.2.10; I.2.11
Cite as:	arXiv:2303.02384 [cs.CV]
	(or arXiv:2303.02384v4 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2303.02384
Related DOI:	https://doi.org/10.1109/TNNLS.2024.3396628

Submission history

From: Yamin Sepehri
[v1] Sat, 4 Mar 2023 11:30:16 UTC (1,517 KB)
[v2] Sun, 19 Mar 2023 14:39:50 UTC (1,514 KB)
[v3] Sun, 23 Apr 2023 11:59:54 UTC (1,514 KB)
[v4] Mon, 20 May 2024 20:18:42 UTC (2,759 KB)
