LadaBERT: Lightweight Adaptation of BERT through Hybrid Model Compression

Mao, Yihuan; Wang, Yujing; Wu, Chufan; Zhang, Chen; Wang, Yang; Yang, Yaming; Zhang, Quanlu; Tong, Yunhai; Bai, Jing

arXiv:2004.04124 (cs)

[Submitted on 8 Apr 2020 (v1), last revised 21 Oct 2020 (this version, v2)]

Abstract: BERT is a cutting-edge language representation model pre-trained on a large corpus, and it achieves superior performance on various natural language understanding tasks. However, a major obstacle to applying BERT in online services is that it is memory-intensive and leads to unsatisfactory latency for user requests, which makes model compression necessary. Existing solutions leverage the knowledge distillation framework to learn a smaller model that imitates the behavior of BERT. However, the training procedure of knowledge distillation is itself expensive, as it requires sufficient training data to imitate the teacher model. In this paper, we address this issue by proposing a hybrid solution named LadaBERT (Lightweight adaptation of BERT through hybrid model compression), which combines the advantages of different model compression methods, including weight pruning, matrix factorization, and knowledge distillation. LadaBERT achieves state-of-the-art accuracy on various public datasets while reducing training overhead by an order of magnitude.
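The three compression methods named in the abstract can be illustrated on a single weight matrix. The following is a minimal sketch, not the authors' implementation: it assumes magnitude-based pruning, truncated SVD as the matrix factorization, and a temperature-softened cross-entropy as the distillation objective. The function names, the `sparsity`, `rank`, and `temperature` parameters, and their default values are all hypothetical and chosen for illustration only.

```python
import numpy as np


def magnitude_prune(w, sparsity):
    """Zero out the `sparsity` fraction of entries with the smallest magnitude."""
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    # k-th smallest absolute value acts as the pruning threshold.
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)


def low_rank_factorize(w, rank):
    """Truncated SVD: return A (m x rank) and B (rank x n) with A @ B close to w."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    return u[:, :rank] * s[:rank], vt[:rank, :]


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions."""
    p_teacher = np.exp(log_softmax(teacher_logits, temperature))
    log_p_student = log_softmax(student_logits, temperature)
    return float(-(p_teacher * log_p_student).sum(axis=-1).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((8, 6))          # stand-in for one weight matrix
    a, b = low_rank_factorize(magnitude_prune(w, 0.5), rank=3)
    print("factor shapes:", a.shape, b.shape)
    teacher = rng.standard_normal((4, 3))    # stand-in teacher logits
    student = rng.standard_normal((4, 3))    # stand-in student logits
    print("distillation loss:", distillation_loss(student, teacher))
```

In this sketch, pruning and factorization shrink the parameters cheaply, and the distillation loss would then be minimized during fine-tuning so that the compressed model recovers the teacher's behavior. How the three steps are scheduled and combined in LadaBERT itself is described in the paper, not here.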

Comments:	COLING2020
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2004.04124 [cs.CL]
	(or arXiv:2004.04124v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2004.04124

Submission history

From: Yihuan Mao [view email]
[v1] Wed, 8 Apr 2020 17:18:56 UTC (451 KB)
[v2] Wed, 21 Oct 2020 15:15:11 UTC (448 KB)
