NUMTEMP: A real-world benchmark to verify claims with statistical and temporal expressions

V, Venktesh; Anand, Abhijit; Anand, Avishek; Setty, Vinay

arXiv:2403.17169v1 (cs)

[Submitted on 25 Mar 2024 (this version), latest version 1 May 2024 (v3)]

Abstract: Automated fact checking has gained immense interest as a way to tackle growing misinformation in the digital era. Existing systems primarily focus on synthetic claims on Wikipedia, and noteworthy progress has also been made on real-world claims. In this work, we release Numtemp, a diverse, multi-domain dataset focused exclusively on numerical claims, encompassing temporal, statistical and diverse aspects with fine-grained metadata and an evidence collection without leakage. This addresses the challenge of verifying real-world numerical claims, which are complex and often lack precise information, a challenge not addressed by existing works that mainly focus on synthetic claims. We evaluate and quantify the limitations of existing solutions for the task of verifying numerical claims. We also evaluate claim-decomposition-based methods and numerical-understanding-based models; our best baseline achieves a macro-F1 of 58.32. This demonstrates that Numtemp serves as a challenging evaluation set for numerical claim verification.
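
The abstract reports results as macro-F1, which is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The following is a minimal sketch of that metric only, not the authors' evaluation code. The function name `macro_f1`, the label strings, and the toy predictions are all hypothetical and chosen purely for illustration.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over every label seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    if not labels:
        return 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true class but was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels and predictions, for illustration only.
gold = ["true", "false", "conflicting", "false", "true", "false"]
pred = ["true", "false", "false", "false", "conflicting", "false"]
score = macro_f1(gold, pred)
print(f"macro-F1: {100 * score:.2f}")
```

In this toy case the class that is never predicted correctly contributes an F1 of 0 and pulls the average down sharply, which is why macro-F1 is a demanding measure when classes are imbalanced. The figure in the abstract appears to be on a 0 to 100 scale, hence the multiplication by 100 when printing.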

Comments:	17 pages, 1 figure
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2403.17169 [cs.CL]
	(or arXiv:2403.17169v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2403.17169

Submission history

From: Venktesh V
[v1] Mon, 25 Mar 2024 20:36:03 UTC (3,423 KB)
[v2] Tue, 30 Apr 2024 08:55:23 UTC (3,906 KB)
[v3] Wed, 1 May 2024 06:27:24 UTC (3,905 KB)
