Mining Knowledge for Natural Language Inference from Wikipedia Categories

Chen, Mingda; Chu, Zewei; Stratos, Karl; Gimpel, Kevin

arXiv:2010.01239 (cs)

[Submitted on 3 Oct 2020]

Abstract: Accurate lexical entailment (LE) and natural language inference (NLI) often require large quantities of costly annotations. To alleviate the need for labeled data, we introduce WikiNLI: a resource for improving model performance on NLI and LE tasks. It contains 428,899 pairs of phrases constructed from naturally annotated category hierarchies in Wikipedia. We show that we can improve strong baselines such as BERT and RoBERTa by pretraining them on WikiNLI and transferring the models to downstream tasks. We conduct systematic comparisons with phrases extracted from other knowledge bases such as WordNet and Wikidata and find that pretraining on WikiNLI gives the best performance. In addition, we construct WikiNLI in other languages and show that pretraining on them improves performance on NLI tasks in the corresponding languages.
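
As a rough illustration of how phrase pairs might be derived from a category hierarchy, the sketch below walks a toy parent-to-subcategory map and emits labeled pairs. Everything in it is an assumption made for illustration: the toy data, the function name `build_pairs`, the label names, and the choice to treat siblings as unrelated. None of it comes from the abstract, and it should not be read as the authors' actual construction procedure.

```python
from itertools import combinations

# Toy category hierarchy (hypothetical data, not taken from WikiNLI):
# each key is a category, each value lists its direct subcategories.
TOY_HIERARCHY = {
    "Animals": ["Mammals", "Birds"],
    "Mammals": ["Dogs", "Cats"],
}


def build_pairs(hierarchy):
    """Return (phrase_a, phrase_b, label) triples from a parent -> children map.

    The label describes how phrase_a relates to phrase_b. The label names
    are illustrative assumptions, not confirmed by the abstract.
    """
    pairs = []
    for parent, children in hierarchy.items():
        for child in children:
            # Each direct edge is read in both directions.
            pairs.append((child, parent, "child"))
            pairs.append((parent, child, "parent"))
        # Assumption: siblings under the same parent count as unrelated.
        for a, b in combinations(children, 2):
            pairs.append((a, b, "neutral"))
    return pairs


if __name__ == "__main__":
    for phrase_a, phrase_b, label in build_pairs(TOY_HIERARCHY):
        print(f"{phrase_a}\t{phrase_b}\t{label}")
```

Pairs of this shape could then serve as input to a standard sentence-pair classifier during an intermediate pretraining step, before fine-tuning on a downstream NLI or LE dataset.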

Comments:	Findings of EMNLP 2020
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2010.01239 [cs.CL]
	(or arXiv:2010.01239v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2010.01239

Submission history

From: Mingda Chen
[v1] Sat, 3 Oct 2020 00:45:01 UTC (282 KB)
