Applying Graph Explanation to Operator Fusion

Mills, Keith G.; Qharabagh, Muhammad Fetrat; Qiu, Weichen; Han, Fred X.; Salameh, Mohammad; Lu, Wei; Jui, Shangling; Niu, Di

arXiv:2501.00636 (cs)

[Submitted on 31 Dec 2024]

Abstract: Layer fusion techniques are critical to improving the inference efficiency of deep neural networks (DNNs) for deployment. Fusion aims to lower inference costs by reducing data transactions between an accelerator's on-chip buffer and DRAM. It does so by executing multiple operations, such as convolutions and activations, together as single execution units called fusion groups. However, on-chip buffer capacity limits fusion group size, so optimizing fusion across a whole DNN requires partitioning it into multiple fusion groups. Finding the optimal groups is a complex problem: the presence of invalid solutions hampers traditional search algorithms and demands robust approaches. In this paper, we incorporate Explainable AI, specifically Graph Explanation Techniques (GET), into layer fusion. Given an invalid fusion group, we identify the operations most responsible for its invalidity, then use this knowledge to recursively split the original fusion group via a greedy tree-based algorithm to minimize DRAM access. We pair our scheme with common algorithms and optimize DNNs on two types of layer fusion: Line-Buffer Depth First (LBDF) and Branch Requirement Reduction (BRR). Experiments demonstrate the efficacy of our scheme on several popular and classical convolutional neural networks, such as ResNets and MobileNets. Our scheme achieves over 20% DRAM access reduction on EfficientNet-B3.
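
The recursive splitting idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation; everything in it is an assumption made for illustration. A fusion group is modeled as a linear chain of `(name, buffer_cost)` pairs, a group counts as valid when its summed cost fits a hypothetical on-chip `capacity`, and the hypothetical `blame_scores` function stands in for a graph explanation technique by simply blaming each operation in proportion to its cost. DRAM access is not modeled.

```python
from typing import List, Sequence, Tuple

# Hypothetical op representation: (name, on-chip buffer units required).
Op = Tuple[str, int]


def is_valid(group: Sequence[Op], capacity: int) -> bool:
    """Toy validity check: the group fits if its summed cost fits the buffer."""
    return sum(cost for _, cost in group) <= capacity


def blame_scores(group: Sequence[Op]) -> List[int]:
    """Stand-in for a graph explanation technique (assumption).

    A real explainer would attribute invalidity to nodes of the fusion
    graph; here each op is simply blamed in proportion to its cost.
    """
    return [cost for _, cost in group]


def split_group(group: Sequence[Op], capacity: int) -> List[List[Op]]:
    """Recursively split an invalid group at its most-blamed operation."""
    # A valid group is kept whole; a single op cannot be split further,
    # so it is returned alone even if it exceeds the capacity.
    if len(group) <= 1 or is_valid(group, capacity):
        return [list(group)]
    scores = blame_scores(group)
    worst = max(range(len(group)), key=scores.__getitem__)
    # Cut just before the most-blamed op; if it is the first op, cut
    # right after it so that both halves are non-empty.
    cut = worst if worst > 0 else 1
    return split_group(group[:cut], capacity) + split_group(group[cut:], capacity)


ops = [("conv1", 40), ("relu1", 10), ("conv2", 70), ("relu2", 10), ("conv3", 30)]
groups = split_group(ops, capacity=100)
print([[name for name, _ in g] for g in groups])
```

Each recursive call works on a strictly smaller, non-empty slice, so the recursion always terminates. The paper's scheme additionally chooses splits greedily to minimize DRAM access, which this sketch leaves out.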

Comments:	DAC'23 WIP Poster; 8 pages, 5 figures, 5 tables
Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2501.00636 [cs.LG]
	(or arXiv:2501.00636v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2501.00636

Submission history

From: Keith Mills
[v1] Tue, 31 Dec 2024 20:22:10 UTC (250 KB)