Focusing More on Conflicts with Mis-Predictions Helps Language Pre-Training

Xing, Chen; Xiao, Wencong; Li, Yong; Lin, Wei

arXiv:2012.08789v1 (cs)

[Submitted on 16 Dec 2020 (this version), latest version 3 Dec 2021 (v2)]

Abstract: In this work, we propose to improve the effectiveness of language pre-training methods by exploiting mis-predictions made during pre-training. A likely cause of a mis-prediction is that the model neglects words in the input sentence whose semantics conflict with it. We therefore hypothesize that mis-predictions during pre-training can act as detectors of the model's ill focus. If we train the model to focus more on the words that conflict with a mis-prediction and less on the remaining words in the input sentence, the mis-prediction can be corrected more easily and the entire model can be trained better. To this end, we introduce Focusing Less on Context of Mis-predictions (McMisP). In McMisP, we record co-occurrence information between words to detect, in an unsupervised way, the words that conflict with a mis-prediction. McMisP then uses this information to guide the attention modules when a mis-prediction occurs. Specifically, several attention modules in the Transformer are optimized to focus more on words in the input sentence that have rarely co-occurred with the mis-prediction, and less on words that have co-occurred with it frequently. Results show that McMisP significantly expedites the pre-training of BERT and ELECTRA and improves their performance on downstream tasks.
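
The co-occurrence idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how co-occurrence is counted or how it is turned into an attention signal, so the sentence-level counting, the softmax over negative log counts, and the names `build_cooccurrence`, `cooccurrence`, and `attention_target` are all assumptions made for illustration. The sketch only shows how words that rarely co-occur with a mis-predicted word could receive a larger target attention weight.

```python
import math
from collections import Counter
from itertools import combinations


def build_cooccurrence(sentences):
    """Count, for each unordered word pair, the sentences containing both words.

    Sentence-level counting is an assumption; the abstract does not
    state the window used to record co-occurrence.
    """
    counts = Counter()
    for tokens in sentences:
        # sorted(set(...)) yields each distinct pair once, in a canonical order
        for a, b in combinations(sorted(set(tokens)), 2):
            counts[(a, b)] += 1
    return counts


def cooccurrence(counts, a, b):
    """Look up the count for an unordered pair (0 if never seen together)."""
    key = (a, b) if a <= b else (b, a)
    return counts[key]


def attention_target(counts, context, mispredicted, temperature=1.0):
    """Hypothetical target attention distribution over the context words.

    Scores are -log(1 + count), so words that rarely co-occur with the
    mis-predicted word get larger weights after the softmax.
    """
    scores = [
        -math.log1p(cooccurrence(counts, w, mispredicted)) / temperature
        for w in context
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy corpus: "water" has co-occurred with "the" and "drinks", never with "cat".
corpus = [
    ["the", "cat", "drinks", "milk"],
    ["the", "dog", "drinks", "water"],
    ["the", "cat", "chases", "the", "dog"],
]
counts = build_cooccurrence(corpus)

# Suppose the masked word is "milk" and the model mis-predicts "water".
context = ["the", "cat", "drinks"]
weights = attention_target(counts, context, "water")
```

In this toy setting the weight on "cat" is the largest of the three, since "cat" never appears alongside "water" in the corpus. How such a target would be combined with the pre-training loss, and which attention modules it would supervise, is left open by the abstract.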

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2012.08789 [cs.CL]
	(or arXiv:2012.08789v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2012.08789

Submission history

From: Chen Xing
[v1] Wed, 16 Dec 2020 08:21:51 UTC (158 KB)
[v2] Fri, 3 Dec 2021 17:04:04 UTC (1,201 KB)
