Investigating Catastrophic Forgetting During Continual Training for Neural Machine Translation

Gu, Shuhao; Feng, Yang

arXiv:2011.00678 (cs)

[Submitted on 2 Nov 2020 (v1), last revised 30 Nov 2020 (this version, v3)]

Abstract: Neural machine translation (NMT) models usually suffer from catastrophic forgetting during continual training: they gradually forget previously learned knowledge and shift to fit the newly added data, which may have a different distribution, e.g. a different domain. Although many methods have been proposed to address this problem, its cause is still not well understood. In the setting of domain adaptation, we investigate the cause of catastrophic forgetting from the perspectives of modules and parameters (neurons). The investigation of the modules of the NMT model shows that some modules are closely tied to general-domain knowledge, while others are more essential for domain adaptation. The investigation of the parameters shows that some parameters are important for both general-domain and in-domain translation, and that large changes to them during continual training cause the decline in general-domain performance. We conduct experiments across different language pairs and domains to ensure the validity and reliability of our findings.
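As a rough illustration of the parameter-level view described in the abstract, the sketch below compares a general-domain checkpoint with a continually trained one and ranks modules by how far their parameters moved. This is not the authors' analysis, which the abstract does not specify; the module names, the toy values, and the mean-absolute-change metric are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def mean_abs_change(before: List[float], after: List[float]) -> float:
    """Average absolute difference between two equally sized parameter lists."""
    if len(before) != len(after):
        raise ValueError("parameter lists must have the same length")
    if not before:
        return 0.0
    return sum(abs(a - b) for b, a in zip(before, after)) / len(before)


def rank_modules_by_change(
    general: Dict[str, List[float]],
    adapted: Dict[str, List[float]],
) -> List[Tuple[str, float]]:
    """Return (module, mean absolute change) pairs, largest change first."""
    changes = {
        name: mean_abs_change(general[name], adapted[name]) for name in general
    }
    return sorted(changes.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy checkpoints: names and values are made up for this sketch.
general_ckpt = {
    "encoder.layer0": [0.5, -0.25, 1.0],
    "decoder.layer0": [0.0, 0.75, -0.5],
}
adapted_ckpt = {
    "encoder.layer0": [0.5, -0.25, 1.25],
    "decoder.layer0": [1.0, -0.25, -0.5],
}

ranking = rank_modules_by_change(general_ckpt, adapted_ckpt)
for module, change in ranking:
    print(f"{module}: mean |delta| = {change:.4f}")
```

In a real setting the lists would be flattened weight tensors from two saved models, and a large shift in a module would only be a hint worth checking against general-domain translation quality, not evidence of forgetting on its own.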

Comments:	COLING 2020 long paper
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2011.00678 [cs.CL]
	(or arXiv:2011.00678v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2011.00678

Submission history

From: Shuhao Gu
[v1] Mon, 2 Nov 2020 01:55:06 UTC (2,177 KB)
[v2] Tue, 3 Nov 2020 08:49:42 UTC (2,177 KB)
[v3] Mon, 30 Nov 2020 06:56:52 UTC (2,181 KB)
