Large Language Model Pruning

Huang, Hanjuan; Song, Hao-Jia; Pao, Hsing-Kuo


arXiv:2406.00030 (cs)

[Submitted on 24 May 2024]

Title: Large Language Model Pruning

Authors: Hanjuan Huang (1) (2), Hao-Jia Song (1), Hsing-Kuo Pao (1) ((1) Dept. of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan; (2) College of Mechanical and Electrical Engineering, WUYI University, Wuyishan, China)


Abstract: Over the last few years, ever-larger models have delivered superior performance, made possible by hardware and software that can support such extremely large models. Their applications include text mining, among other fields. In particular, the success of LLMs in text understanding and text generation has drawn attention from researchers who have worked on NLP and related areas for years or even decades. On the downside, LLMs may suffer from problems such as overfitting, hallucination, and device limitations, to name a few. In this work, we propose a model pruning technique specifically focused on LLMs. The proposed methodology emphasizes the explainability of deep learning models. Because the method rests on a theoretical foundation, it yields a trustworthy deep model, so that huge models with a massive number of parameters become less necessary. A mutual information-based estimate is used to identify redundant neurons to eliminate. Moreover, an estimator with well-tuned parameters provides precise estimates to guide the pruning procedure. We also explore the difference between pruning large-scale models and pruning small-scale models: small models are sensitive to the choice of pruning criterion, whereas large-scale models are not, which is a novel finding of this work. Overall, we demonstrate the superiority of the proposed model over state-of-the-art models.
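
To make the idea of using mutual information to find redundant neurons concrete, here is a minimal sketch. It does not reproduce the paper's estimator or pruning criterion, which this page does not describe. It assumes a simple histogram (plug-in) estimate of mutual information between pairs of neuron activations and a greedy rule that drops a neuron when it shares too much information with one already kept. The names `mutual_information` and `drop_redundant_neurons`, the bin count, and the threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of I(X; Y) in nats for two 1-D samples.

    Assumption: equal-width binning; this is not the estimator used in the paper.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()               # joint distribution over bins
    px = pxy.sum(axis=1, keepdims=True)     # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)     # marginal of y, shape (1, bins)
    mask = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))


def drop_redundant_neurons(activations, threshold=0.5, bins=8):
    """Greedily keep neurons whose estimated MI with every kept neuron is low.

    activations: array of shape (n_samples, n_neurons).
    Returns the indices of the neurons that are kept.
    The threshold (in nats) is an illustrative assumption.
    """
    kept = []
    for j in range(activations.shape[1]):
        redundant = any(
            mutual_information(activations[:, j], activations[:, k], bins) > threshold
            for k in kept
        )
        if not redundant:
            kept.append(j)
    return kept


# Toy data: neuron 1 is a noisy copy of neuron 0; neuron 2 is independent.
rng = np.random.default_rng(0)
z0 = rng.normal(size=2000)
z2 = rng.normal(size=2000)
acts = np.column_stack([z0, z0 + 0.05 * rng.normal(size=2000), z2])

kept = drop_redundant_neurons(acts)
print(kept)
```

In this toy setup the near-duplicate neuron shares far more estimated information with neuron 0 than the independent neuron does, so only the duplicate is dropped. A histogram estimate is biased for small samples and sensitive to the bin count, which is one reason a carefully tuned estimator matters when it guides pruning.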

Comments:	17 pages, 7 figures, 2 tables
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2406.00030 [cs.CL]
	(or arXiv:2406.00030v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2406.00030

Submission history

From: Hsing-Kuo Pao Kenneth
[v1] Fri, 24 May 2024 18:22:15 UTC (783 KB)
