Enhancing Natural Language Representation with Large-Scale Out-of-Domain Commonsense

Cui, Wanyun; Chen, Xingran


arXiv:2109.02572 (cs)

[Submitted on 6 Sep 2021 (v1), last revised 13 Mar 2022 (this version, v3)]



Abstract: We study how to enhance text representation with textual commonsense. We point out that commonsense exhibits domain discrepancy: it has a different data format from the downstream task and is independent of the task's domain. This discrepancy makes it challenging to introduce commonsense into general text understanding tasks. A typical way to introduce textual knowledge is to continue pre-training on the commonsense corpus. However, because of the domain discrepancy, this causes catastrophic forgetting on the downstream task. In addition, previous methods that directly use textual descriptions as extra input cannot be applied to large-scale commonsense.
In this paper, we propose to use large-scale out-of-domain commonsense to enhance text representation. To incorporate the commonsense effectively, we propose OK-Transformer (Out-of-domain Knowledge enhanced Transformer). OK-Transformer integrates commonsense descriptions and uses them to enhance the target text representation. In addition, OK-Transformer can be adapted to Transformer-based language models (e.g., BERT, RoBERTa) for free, without pre-training on large-scale unsupervised corpora. We verify the effectiveness of OK-Transformer in multiple applications, such as commonsense reasoning, general text classification, and low-resource commonsense settings.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2109.02572 [cs.CL]
	(or arXiv:2109.02572v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2109.02572

Submission history

From: Wanyun Cui
[v1] Mon, 6 Sep 2021 16:16:10 UTC (1,249 KB)
[v2] Sun, 19 Sep 2021 12:58:19 UTC (1,303 KB)
[v3] Sun, 13 Mar 2022 12:52:37 UTC (1,294 KB)
