ExpliCa: Evaluating Explicit Causal Reasoning in Large Language Models

Miliani, Martina; Auriemma, Serena; Bondielli, Alessandro; Chersoni, Emmanuele; Passaro, Lucia; Sucameli, Irene; Lenci, Alessandro

arXiv:2502.15487 (cs)

[Submitted on 21 Feb 2025 (v1), last revised 26 Feb 2025 (this version, v2)]

Abstract: Large Language Models (LLMs) are increasingly used in tasks requiring interpretive and inferential accuracy. In this paper, we introduce ExpliCa, a new dataset for evaluating LLMs in explicit causal reasoning. ExpliCa uniquely integrates both causal and temporal relations presented in different linguistic orders and explicitly expressed by linguistic connectives. The dataset is enriched with crowdsourced human acceptability ratings. We tested LLMs on ExpliCa through prompting and perplexity-based metrics. We assessed seven commercial and open-source LLMs, revealing that even top models struggle to reach 0.80 accuracy. Interestingly, models tend to confound temporal relations with causal ones, and their performance is also strongly influenced by the linguistic order of the events. Finally, perplexity-based scores and prompting performance are differently affected by model size.
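
As a rough illustration of what a perplexity-based evaluation of connectives can look like, the sketch below scores two versions of the same sentence pair, each joined by a different connective, and selects the one with the lower perplexity. This is not the authors' implementation: the abstract does not specify the scoring procedure. The function names, the connectives "so" and "then", and the log-probability values are all hypothetical and chosen only for illustration. In practice the per-token log-probabilities would come from a language model.

```python
import math

def perplexity(token_logprobs):
    """Perplexity of a sequence given per-token natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    # Perplexity is the exponential of the negative mean log-probability.
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def pick_connective(logprobs_by_connective):
    """Return the connective whose full sentence has the lowest perplexity."""
    return min(
        logprobs_by_connective,
        key=lambda c: perplexity(logprobs_by_connective[c]),
    )

# Hypothetical log-probabilities (made up, not taken from the paper or any model).
example = {
    "so": [-1.2, -0.8, -0.5, -0.9],
    "then": [-1.2, -0.8, -1.9, -1.4],
}
best = pick_connective(example)
```

Under this assumed setup, a model would count as correct on an item when the selected connective matches the relation preferred by the human ratings.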

Comments:	Submitted to ACL 2025
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
MSC classes:	68T50, 68T07
ACM classes:	I.2.7
Cite as:	arXiv:2502.15487 [cs.CL]
	(or arXiv:2502.15487v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2502.15487

Submission history

From: Martina Miliani
[v1] Fri, 21 Feb 2025 14:23:14 UTC (2,697 KB)
[v2] Wed, 26 Feb 2025 07:15:45 UTC (2,698 KB)
