Scaling Up Deep Neural Network Optimization for Edge Inference

Lu, Bingqian; Yang, Jianyi; Ren, Shaolei

arXiv:2009.00278 (cs)

[Submitted on 1 Sep 2020 (v1), last revised 17 Sep 2020 (this version, v3)]

Abstract: Deep neural networks (DNNs) have been increasingly deployed on and integrated with edge devices, such as mobile phones, drones, robots and wearables. To run DNN inference directly on edge devices (a.k.a. edge inference) with satisfactory performance, optimizing the DNN design (e.g., network architecture and quantization policy) is crucial. While state-of-the-art DNN designs have leveraged performance predictors to speed up the optimization process, these predictors are device-specific (i.e., each predictor serves only one target device) and hence cannot scale well in the presence of extremely diverse edge devices. Moreover, even with performance predictors, the optimizer (e.g., search-based optimization) can still be time-consuming when optimizing DNNs for many different devices. In this work, we propose two approaches to scaling up DNN optimization. In the first approach, we reuse the performance predictors built on a proxy device, and leverage performance monotonicity to scale up the DNN optimization without rebuilding performance predictors for each different device. In the second approach, we build scalable performance predictors that can estimate the resulting performance (e.g., inference accuracy/latency/energy) given a DNN-device pair, and use a neural network-based automated optimizer that takes both device features and optimization parameters as input and then directly outputs the optimal DNN design without going through a lengthy optimization process for each individual device.
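The first approach in the abstract rests on performance monotonicity: if one design is faster than another on the proxy device, it is assumed to be faster on the target device too. The following is a minimal sketch of that idea only, not the authors' algorithm. The design names, accuracy values, and latency numbers are hypothetical and chosen purely for illustration, and the helper functions `is_monotonic` and `pareto_front` are assumptions introduced here.

```python
from typing import Dict, List

def is_monotonic(proxy: Dict[str, float], target: Dict[str, float]) -> bool:
    """Return True if no pair of designs is ordered oppositely on the two devices."""
    names = list(proxy)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            # A negative product means the pair swaps order between devices.
            if (proxy[a] - proxy[b]) * (target[a] - target[b]) < 0:
                return False
    return True

def pareto_front(accuracy: Dict[str, float], latency: Dict[str, float]) -> List[str]:
    """Designs not dominated in (higher accuracy, lower latency)."""
    front = []
    for a in accuracy:
        dominated = any(
            accuracy[b] >= accuracy[a]
            and latency[b] <= latency[a]
            and (accuracy[b] > accuracy[a] or latency[b] < latency[a])
            for b in accuracy
            if b != a
        )
        if not dominated:
            front.append(a)
    return sorted(front)

# Hypothetical candidate designs (illustrative numbers, not from the paper).
accuracy = {"A": 0.70, "B": 0.74, "C": 0.73, "D": 0.78}
proxy_latency_ms = {"A": 10.0, "B": 18.0, "C": 25.0, "D": 40.0}
target_latency_ms = {"A": 31.0, "B": 47.0, "C": 80.0, "D": 95.0}

if is_monotonic(proxy_latency_ms, target_latency_ms):
    # Rankings agree, so the front found with proxy latencies can be reused.
    candidates = pareto_front(accuracy, proxy_latency_ms)
else:
    # Rankings disagree, so fall back to target-specific measurements.
    candidates = pareto_front(accuracy, target_latency_ms)
```

In this toy setting the latency rankings agree on both devices, so the accuracy/latency Pareto front computed from proxy latencies matches the one computed from target latencies, which is why a proxy-device predictor could stand in for a target-specific one under the monotonicity assumption.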

Comments:	Position paper. New algorithm added. Part of the content (from Section 5 "Learning to Optimize") will be presented in the work-in-progress poster session of the ACM/IEEE Symposium on Edge Computing 2020
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2009.00278 [cs.LG]
	(or arXiv:2009.00278v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2009.00278

Submission history

From: Shaolei Ren
[v1] Tue, 1 Sep 2020 07:47:22 UTC (119 KB)
[v2] Mon, 7 Sep 2020 06:06:22 UTC (230 KB)
[v3] Thu, 17 Sep 2020 07:41:44 UTC (1,087 KB)
