On-Device Language Models: A Comprehensive Review

Xu, Jiajun; Li, Zhiyuan; Chen, Wei; Wang, Qun; Gao, Xin; Cai, Qi; Ling, Ziyuan

Abstract: The advent of large language models (LLMs) has revolutionized natural language processing applications, and running LLMs on edge devices has become increasingly attractive for reasons including reduced latency, data localization, and personalized user experiences. This comprehensive review examines the challenges of deploying computationally expensive LLMs on resource-constrained devices and explores innovative solutions across multiple domains. The paper investigates the development of on-device language models and their efficient architectures, including parameter sharing and modular designs, as well as state-of-the-art compression techniques such as quantization, pruning, and knowledge distillation. Hardware acceleration strategies and collaborative edge-cloud deployment approaches are analyzed, highlighting the intricate balance between performance and resource utilization. Case studies of on-device language models from major mobile manufacturers demonstrate real-world applications and potential benefits. The review also addresses critical aspects such as adaptive learning, multi-modal capabilities, and personalization. By identifying key research directions and open challenges, this paper provides a roadmap for future advancements in on-device language models, emphasizing the need for interdisciplinary efforts to realize the full potential of ubiquitous, intelligent computing while ensuring responsible and ethical deployment. For a comprehensive review of research work and educational resources on on-device large language models (LLMs), please visit this https URL. To download and run on-device LLMs, visit this https URL.

Comments:	38 pages, 6 figures
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2409.00088 [cs.CL]
	(or arXiv:2409.00088v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2409.00088
