Compensate Quantization Errors: Make Weights Hierarchical to Compensate Each Other

Gao, Yifei; Ou, Jie; Wang, Lei; Xiao, Yuting; Xiang, Zhiyuan; Dai, Ruiting; Cheng, Jun

arXiv:2406.16299 (cs)

[Submitted on 24 Jun 2024]

Abstract: Emergent Large Language Models (LLMs) distinguish themselves from traditional language models through their extraordinary performance and powerful deduction capacity. However, their computational and storage costs are substantial, which has made quantization a topic of growing interest. To address the accuracy degradation caused by quantization, two lines of work in post-training quantization stand out: one uses other weights to compensate for existing quantization error, while the other transfers the quantization difficulty to other parts of the model. Combining the merits of both, we introduce Learnable Singular value Increment (LSI) as an advanced solution. LSI uses Singular Value Decomposition to extract the singular values of the weights and makes them learnable, helping the weights compensate for each other conditioned on activation. Incorporating LSI with existing techniques, we achieve state-of-the-art performance in diverse quantization settings, whether weight-only, weight-activation, or extremely low-bit scenarios. By unleashing the potential of LSI, efficient finetuning on quantized models is no longer a prohibitive problem.
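
The parameterization described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the per-row uniform quantizer, the function names (`quantize_uniform`, `lsi_weights`, `output_error`), the matrix shapes, and the activation-based error measure are all assumptions chosen for illustration, since the abstract does not specify them. The sketch only shows how a weight matrix can be rebuilt from singular values shifted by an increment `delta`, and how the effect of quantization could be measured on activations.

```python
import numpy as np

def quantize_uniform(w, bits=4):
    """Per-row uniform quantize-then-dequantize (a generic, assumed quantizer)."""
    levels = 2 ** bits - 1
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    scale = np.maximum(w_max - w_min, 1e-8) / levels
    q = np.clip(np.round((w - w_min) / scale), 0, levels)
    return q * scale + w_min

def lsi_weights(u, s, vt, delta):
    """Rebuild W = U diag(s + delta) V^T; delta is the would-be learnable increment."""
    return (u * (s + delta)) @ vt

def output_error(w, w_hat, x):
    """Error measured on layer outputs, i.e. conditioned on activations x (assumed metric)."""
    return float(np.linalg.norm(x @ w.T - x @ w_hat.T))

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 16))    # hypothetical full-precision weight matrix
x = rng.normal(size=(32, 16))   # hypothetical calibration activations

u, s, vt = np.linalg.svd(w, full_matrices=False)
delta = np.zeros_like(s)        # zero increment reproduces the original weights

w_rebuilt = lsi_weights(u, s, vt, delta)
w_quant = quantize_uniform(w_rebuilt, bits=4)
err = output_error(w, w_quant, x)
```

In a training setup, `delta` would presumably be optimized on calibration data to reduce `err`, for example by gradient descent through the rounding step with a straight-through estimator; that optimization loop is an assumption and is left out here.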

Comments:	Efficient quantization method
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
MSC classes:	F.2.3
Cite as:	arXiv:2406.16299 [cs.CL]
	(or arXiv:2406.16299v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2406.16299

Submission history

From: Yifei Gao
[v1] Mon, 24 Jun 2024 03:52:52 UTC (9,080 KB)
