HeadlineCause: A Dataset of News Headlines for Detecting Causalities

Gusev, Ilya; Tikhonov, Alexey

arXiv:2108.12626 (cs)

[Submitted on 28 Aug 2021 (v1), last revised 28 Sep 2021 (this version, v2)]

Abstract: Detecting implicit causal relations in texts is a task that requires both common sense and world knowledge. Existing datasets focus either on commonsense causal reasoning or on explicit causal relations. In this work, we present HeadlineCause, a dataset for detecting implicit causal relations between pairs of news headlines. The dataset includes over 5000 headline pairs from English news and over 9000 headline pairs from Russian news, labeled through crowdsourcing. The pairs range from totally unrelated, or merely belonging to the same general topic, to those involving causation and refutation relations. We also present a set of models and experiments that demonstrate the dataset's validity, including a multilingual XLM-RoBERTa-based model for causality detection and a GPT-2-based model for predicting possible effects.

Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2108.12626 [cs.CL]
	(or arXiv:2108.12626v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2108.12626

Submission history

From: Ilya Gusev
[v1] Sat, 28 Aug 2021 11:12:49 UTC (692 KB)
[v2] Tue, 28 Sep 2021 14:01:26 UTC (835 KB)
