Finding Sparse Structures for Domain Specific Neural Machine Translation

Liang, Jianze; Zhao, Chengqi; Wang, Mingxuan; Qiu, Xipeng; Li, Lei

arXiv:2012.10586 (cs)

[Submitted on 19 Dec 2020 (v1), last revised 26 Mar 2021 (this version, v2)]

Abstract: Neural machine translation often adopts fine-tuning to adapt to specific domains. However, unrestricted fine-tuning can easily degrade performance on the general domain and over-fit to the target domain. To mitigate this issue, we propose Prune-Tune, a novel domain adaptation method based on gradual pruning. It learns tiny domain-specific sub-networks during fine-tuning on new domains. Prune-Tune alleviates both the over-fitting and the degradation problem without modifying the model. Furthermore, Prune-Tune can sequentially learn a single network with multiple disjoint domain-specific sub-networks for multiple domains. Experimental results show that Prune-Tune outperforms several strong competitors on the target-domain test set without sacrificing quality on the general domain, in both single-domain and multi-domain settings. The source code and data are available at this https URL.
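The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract only states that pruning is gradual and that small domain-specific sub-networks are learned during fine-tuning. The magnitude-based pruning criterion, the cubic sparsity schedule, the freezing of the surviving general-domain weights, and every function name below (`sparsity_at`, `magnitude_mask`, `gradual_prune`, `domain_update`) are assumptions made for illustration, and the model is reduced to a flat list of weights.

```python
import random

def sparsity_at(step, total_steps, final_sparsity):
    """Assumed schedule: cubic ramp from 0 up to final_sparsity."""
    progress = min(step, total_steps) / total_steps
    return final_sparsity * (1.0 - (1.0 - progress) ** 3)

def magnitude_mask(weights, sparsity):
    """Return 1 for weights kept by the general model, 0 for pruned slots.

    Assumption: the smallest-magnitude weights are the ones pruned.
    """
    n_prune = int(round(sparsity * len(weights)))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]

def gradual_prune(weights, final_sparsity, total_steps):
    """Zero out a growing share of the smallest weights over several steps."""
    weights = list(weights)
    mask = [1] * len(weights)
    for step in range(1, total_steps + 1):
        target = sparsity_at(step, total_steps, final_sparsity)
        mask = magnitude_mask(weights, target)
        weights = [w * m for w, m in zip(weights, mask)]
        # A real run would also keep training the surviving weights
        # on general-domain data between pruning steps.
    return weights, mask

def domain_update(weights, general_mask, grads, lr):
    """Gradient step on the freed slots only; general weights stay frozen."""
    return [w if m == 1 else w - lr * g
            for w, m, g in zip(weights, general_mask, grads)]

random.seed(0)
w = [random.uniform(-1, 1) for _ in range(20)]
pruned_w, mask = gradual_prune(w, final_sparsity=0.25, total_steps=5)
fake_grads = [1.0] * len(w)  # placeholder for target-domain gradients
tuned_w = domain_update(pruned_w, mask, fake_grads, lr=0.1)
```

Under these assumptions, the weights with `mask == 1` are never touched by `domain_update`, which is one way to read the claim that general-domain quality is preserved. Extending the sketch to several domains would mean assigning each new domain its own non-overlapping subset of the freed slots.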

Comments:	Accepted to AAAI 2021
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2012.10586 [cs.CL]
	(or arXiv:2012.10586v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2012.10586

Submission history

From: Jianze Liang
[v1] Sat, 19 Dec 2020 03:33:27 UTC (121 KB)
[v2] Fri, 26 Mar 2021 16:57:21 UTC (2,558 KB)
