Anatomizing Deep Learning Inference in Web Browsers

Wang, Qipeng; Jiang, Shiqi; Chen, Zhenpeng; Cao, Xu; Li, Yuanchun; Li, Aoyu; Ma, Yun; Cao, Ting; Liu, Xuanzhe

Computer Science > Machine Learning

arXiv:2402.05981 (cs)

[Submitted on 8 Feb 2024 (v1), last revised 25 Jul 2024 (this version, v2)]

Abstract: Web applications have increasingly adopted Deep Learning (DL) through in-browser inference, wherein DL inference is performed directly within Web browsers. The actual performance of in-browser inference and its impact on the quality of experience (QoE) remain unexplored, and they urgently call for new QoE measurements beyond traditional ones, which mainly focus on page load time. To bridge this gap, we make the first comprehensive performance measurement of in-browser inference to date. Our approach proposes new metrics to measure in-browser inference: responsiveness, smoothness, and inference accuracy. Our extensive analysis involves 9 representative DL models across Web browsers of 50 popular PC devices and 20 mobile devices. The results reveal that in-browser inference exhibits a substantial latency gap, averaging 16.9 times slower on CPU and 4.9 times slower on GPU compared to native inference on PC devices. The gap on mobile CPU and mobile GPU is 15.8 times and 7.8 times, respectively. Furthermore, we identify factors contributing to this latency gap, including underutilized hardware instruction sets, inherent overhead in the runtime environment, resource contention within the browser, and inefficiencies in software libraries and GPU abstractions. Additionally, in-browser inference imposes significant memory demands, in some cases exceeding 334.6 times the size of the DL models themselves, partly attributable to suboptimal memory management. We also observe that in-browser inference leads to a 67.2% increase in the time it takes for GUI components to render within Web browsers, significantly affecting the overall QoE for users of Web applications that rely on this technology.

Comments:	Accepted by ACM Transactions on Software Engineering and Methodology (TOSEM)
Subjects:	Machine Learning (cs.LG); Performance (cs.PF)
Cite as:	arXiv:2402.05981 [cs.LG]
	(or arXiv:2402.05981v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2402.05981

Submission history

From: Qipeng Wang
[v1] Thu, 8 Feb 2024 08:02:57 UTC (2,461 KB)
[v2] Thu, 25 Jul 2024 13:37:16 UTC (2,690 KB)
