Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models

Vu, Thuy-Trang; Phung, Dinh; Haffari, Gholamreza

arXiv:2010.01739 (cs)

[Submitted on 5 Oct 2020]

Abstract: Recent work has shown the importance of adaptation of broad-coverage contextualised embedding models on the domain of the target task of interest. Current self-supervised adaptation methods are simplistic, as the training signal comes from a small percentage of randomly masked-out tokens. In this paper, we show that careful masking strategies can bridge the knowledge gap of masked language models (MLMs) about the domains more effectively by allocating self-supervision where it is needed. Furthermore, we propose an effective training strategy by adversarially masking out those tokens which are harder to reconstruct by the underlying MLM. The adversarial objective leads to a challenging combinatorial optimisation problem over subsets of tokens, which we tackle efficiently through relaxation to a variational lower bound and dynamic programming. On six unsupervised domain adaptation tasks involving named entity recognition, our method strongly outperforms the random masking strategy and achieves up to +1.64 F1 score improvements.
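
As a rough illustration of the contrast the abstract draws, the sketch below compares random masking with a greedy "mask the hardest tokens" rule. It is not the paper's method: the paper optimises over subsets of tokens via a variational lower bound and dynamic programming, whereas this sketch simply takes the top-k positions by loss. The function names, the example sentence, and the per-token loss values are all hypothetical and chosen only for illustration.

```python
import random


def random_mask(num_tokens, k, seed=0):
    """Baseline: choose k positions uniformly at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_tokens), k))


def hardest_mask(token_losses, k):
    """Greedy stand-in for adversarial masking: choose the k positions
    whose (assumed) MLM reconstruction loss is highest."""
    ranked = sorted(
        range(len(token_losses)),
        key=lambda i: token_losses[i],
        reverse=True,
    )
    return sorted(ranked[:k])


# Hypothetical sentence and made-up per-token reconstruction losses.
tokens = ["the", "patient", "received", "metformin", "for", "type-2", "diabetes"]
losses = [0.1, 0.9, 0.4, 3.2, 0.1, 2.5, 1.8]

k = 2
hard_positions = hardest_mask(losses, k)
rand_positions = random_mask(len(tokens), k)

# Replace the selected positions with a mask symbol.
masked = ["[MASK]" if i in hard_positions else t for i, t in enumerate(tokens)]
print(" ".join(masked))
```

Under these made-up losses the greedy rule masks the domain-specific terms, while the random baseline is equally likely to spend its budget on function words such as "the" or "for".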

Comments:	EMNLP2020
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2010.01739 [cs.CL]
	(or arXiv:2010.01739v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2010.01739

Submission history

From: Thuy-Trang Vu
[v1] Mon, 5 Oct 2020 01:49:47 UTC (7,508 KB)
