First Activations Matter: Training-Free Methods for Dynamic Activation in Large Language Models

Ma, Chi; Huang, Mincong; Zhang, Ying; Wang, Chao; Wang, Yujie; Yu, Lei; Liu, Chuan; Lin, Wei

Computer Science > Computation and Language

arXiv:2408.11393 (cs)

[Submitted on 21 Aug 2024]

Abstract: Dynamic activation (DA) techniques, such as DejaVu and MoEfication, have demonstrated their potential to significantly enhance the inference efficiency of large language models (LLMs). However, these techniques often rely on ReLU activation functions or require additional parameters and training to maintain performance. This paper introduces a training-free Threshold-based Dynamic Activation (TDA) method that leverages sequence information to exploit the inherent sparsity of models across various architectures. The method is designed to accelerate generation speed by 18-25% without significantly compromising task performance, thereby addressing the limitations of existing DA techniques. Moreover, we delve into the root causes of LLM sparsity and theoretically analyze two of its critical features: history-related activation uncertainty and semantic-irrelevant activation inertia. Our analyses not only provide a theoretical foundation for DA methods but also offer insights to guide future research in optimizing LLMs for greater efficiency and effectiveness.
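The abstract does not spell out how TDA selects neurons, so the following is only a rough illustration of the general idea of threshold-based dynamic activation, not the authors' algorithm. It assumes that a neuron mask is derived from the activations of the first (prompt) tokens and then reused during generation; the function names, the mean-magnitude criterion, and the threshold value are all hypothetical.

```python
def select_active_neurons(prompt_activations, threshold):
    """Pick neurons whose mean |activation| over the prompt tokens exceeds threshold.

    prompt_activations: list of rows, one per prompt token, each a list of
    per-neuron activation values (hypothetical layout, chosen for illustration).
    """
    num_tokens = len(prompt_activations)
    num_neurons = len(prompt_activations[0])
    active = []
    for j in range(num_neurons):
        mean_mag = sum(abs(row[j]) for row in prompt_activations) / num_tokens
        if mean_mag > threshold:
            active.append(j)
    return active


def sparse_down_projection(hidden, w_down, active):
    """Apply the output projection using only the selected neurons.

    hidden: per-neuron activations for the current token.
    w_down: w_down[j] is the output-space row contributed by neuron j.
    Neurons outside `active` are skipped, which is where the compute saving
    would come from in a real kernel.
    """
    out = [0.0] * len(w_down[0])
    for j in active:
        for k in range(len(out)):
            out[k] += hidden[j] * w_down[j][k]
    return out


# Toy example: neuron 1 is nearly silent on the prompt and is dropped.
prompt_acts = [[0.9, 0.01, -0.5],
               [1.1, -0.02, 0.4]]
mask = select_active_neurons(prompt_acts, threshold=0.1)
w_down = [[1.0, 0.0], [5.0, 5.0], [0.0, 2.0]]
approx = sparse_down_projection([1.0, 0.03, -0.5], w_down, mask)
```

In this toy case the sparse output differs from the dense one only by the small contribution of the skipped neuron; how well that holds for real models is exactly what the paper evaluates.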

Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2408.11393 [cs.CL]
	(or arXiv:2408.11393v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2408.11393

Submission history

From: Yujie Wang
[v1] Wed, 21 Aug 2024 07:38:51 UTC (1,937 KB)
