GenAI vs. Human Fact-Checkers: Accurate Ratings, Flawed Rationales

Tai, Yuehong Cassandra; Patni, Khushi Navin; Hemauer, Nicholas Daniel; Desmarais, Bruce; Lin, Yu-Ru

arXiv:2502.14943 (cs)

[Submitted on 20 Feb 2025 (v1), last revised 25 Feb 2025 (this version, v3)]

Abstract: Despite recent advances in understanding the capabilities and limits of generative artificial intelligence (GenAI) models, we are just beginning to understand their capacity to assess and reason about the veracity of content. We evaluate multiple GenAI models across tasks that involve the rating of, and perceived reasoning about, the credibility of information. The information in our experiments comes from content that subnational U.S. politicians post to Facebook. We find that GPT-4o, one of the most used AI models in consumer applications, outperforms other models, but all models exhibit only moderate agreement with human coders. Importantly, even when GenAI models accurately identify low-credibility content, their reasoning relies heavily on linguistic features and "hard" criteria, such as the level of detail, source reliability, and language formality, rather than an understanding of veracity. We also assess the effectiveness of summarized versus full content inputs, finding that summarized content holds promise for improving efficiency without sacrificing accuracy. While GenAI has the potential to support human fact-checkers in scaling misinformation detection, our results caution against relying solely on these models.

Comments:	Accepted for publication in the 17th ACM Web Science Conference 2025
Subjects:	Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2502.14943 [cs.AI]
	(or arXiv:2502.14943v3 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2502.14943

Submission history

From: Yuehong Cassandra Tai
[v1] Thu, 20 Feb 2025 17:47:40 UTC (1,932 KB)
[v2] Mon, 24 Feb 2025 13:31:36 UTC (1,016 KB)
[v3] Tue, 25 Feb 2025 19:06:04 UTC (1,007 KB)
