Rapid Deployment of DNNs for Edge Computing via Structured Pruning at Initialization

Eccles, Bailey J.; Wong, Leon; Varghese, Blesson

arXiv:2404.16877 (cs)

[Submitted on 22 Apr 2024]

Abstract: Edge machine learning (ML) enables localized processing of data on devices and is underpinned by deep neural networks (DNNs). However, DNNs cannot easily be run on devices because of the substantial computing, memory and energy they require to deliver performance comparable to cloud-based ML. Model compression techniques, such as pruning, have therefore been considered. Existing pruning methods are problematic for edge ML since they: (1) create compressed models that either offer limited runtime performance benefits (unstructured pruning) or compromise final model accuracy (structured pruning), and (2) require substantial compute resources and time to identify a suitable compressed DNN model (neural architecture search). In this paper, we explore a new avenue, referred to as Pruning-at-Initialization (PaI), using structured pruning to mitigate the above problems. We develop Reconvene, a system that uses structured PaI to rapidly generate pruned models suited for edge deployments. Reconvene systematically identifies and prunes the DNN convolution layers that are least sensitive to structured pruning. Reconvene creates pruned DNNs within seconds that are up to 16.21x smaller and 2x faster while maintaining the same accuracy as an unstructured PaI counterpart.
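The contrast the abstract draws between unstructured and structured pruning at initialization can be illustrated with a minimal sketch. It is not Reconvene's algorithm: the L1-norm filter ranking, the magnitude threshold, and every function name below are assumptions chosen only for illustration. The sketch shows that structured pruning removes whole convolution filters from freshly initialized weights and so yields a smaller dense tensor, whereas unstructured pruning zeroes individual weights and leaves the tensor shape unchanged.

```python
import random

def make_conv_weights(out_channels, in_channels, k, seed=0):
    """Randomly initialised conv weights as nested lists: [out][in][k][k]."""
    rng = random.Random(seed)
    return [[[[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(k)]
             for _ in range(in_channels)] for _ in range(out_channels)]

def filter_l1_norm(filt):
    """Sum of absolute weights in one output filter."""
    return sum(abs(w) for channel in filt for row in channel for w in row)

def structured_prune(weights, keep_ratio):
    """Drop whole output filters with the smallest L1 norm (assumed criterion)."""
    n_keep = max(1, round(len(weights) * keep_ratio))
    ranked = sorted(range(len(weights)),
                    key=lambda i: filter_l1_norm(weights[i]), reverse=True)
    kept = sorted(ranked[:n_keep])  # preserve the original filter order
    return [weights[i] for i in kept], kept

def unstructured_prune(weights, keep_ratio):
    """Zero the smallest-magnitude individual weights; the shape is unchanged."""
    flat = sorted((abs(w) for f in weights for c in f for r in c for w in r),
                  reverse=True)
    n_keep = max(1, round(len(flat) * keep_ratio))
    threshold = flat[n_keep - 1]
    return [[[[w if abs(w) >= threshold else 0.0 for w in r] for r in c]
             for c in f] for f in weights]

def count_params(weights):
    """Number of stored weights, including any that were zeroed."""
    return sum(1 for f in weights for c in f for r in c for _ in r)

# Pruning is applied to untrained weights, i.e. at initialization.
weights = make_conv_weights(out_channels=8, in_channels=3, k=3)
pruned, kept = structured_prune(weights, keep_ratio=0.5)
masked = unstructured_prune(weights, keep_ratio=0.5)
print(count_params(weights), count_params(pruned), count_params(masked))
# prints: 216 108 216
```

With half of the filters removed, the structured result stores 108 weights instead of 216, while the unstructured result still stores all 216 entries, about half of them zero. A dense layer that is physically smaller needs no sparse kernels to run faster, which is one reason structured pruning tends to translate into runtime gains on edge hardware.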

Comments:	The 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2404.16877 [cs.LG]
	(or arXiv:2404.16877v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2404.16877

Submission history

From: Bailey Eccles
[v1] Mon, 22 Apr 2024 10:57:54 UTC (382 KB)
