Evaluating how LLM annotations represent diverse views on contentious topics

Brown, Megan A.; Atreja, Shubham; Hemphill, Libby; Wu, Patrick Y.

arXiv:2503.23243 (cs)

[Submitted on 29 Mar 2025]

Abstract: Researchers have proposed the use of generative large language models (LLMs) to label data for both research and applied settings. This literature emphasizes the improved performance of LLMs relative to other natural language models, noting that LLMs typically outperform other models on standard metrics such as accuracy, precision, recall, and F1 score. However, previous literature has also highlighted the bias embedded in language models, particularly around contentious topics such as potentially toxic content. This bias could result in labels applied by LLMs that disproportionately align with majority groups over a more diverse set of viewpoints. In this paper, we evaluate how LLMs represent diverse viewpoints on these contentious tasks. Across four annotation tasks on four datasets, we show that LLMs do not show substantial disagreement with annotators on the basis of demographics. Instead, the model, prompt, and disagreement between human annotators on the labeling task are far more predictive of LLM agreement. Our findings suggest that when using LLMs to annotate data, under-representing the views of particular groups is not a substantial concern. We conclude with a discussion of the implications for researchers and practitioners.
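
The comparison the abstract describes, LLM agreement with annotators broken down by demographics, can be illustrated with a per-group agreement rate. The following is a minimal sketch under stated assumptions, not the authors' analysis: the record fields (`group`, `human_label`, `llm_label`), the function name `agreement_by_group`, and the toy data are all hypothetical.

```python
from collections import defaultdict


def agreement_by_group(records):
    """Return {group: share of annotations where the LLM label equals the human label}."""
    totals = defaultdict(int)
    matches = defaultdict(int)
    for rec in records:
        totals[rec["group"]] += 1
        if rec["human_label"] == rec["llm_label"]:
            matches[rec["group"]] += 1
    return {group: matches[group] / totals[group] for group in totals}


# Hypothetical toy data: one row per (item, annotator) pair.
records = [
    {"group": "A", "human_label": 1, "llm_label": 1},
    {"group": "A", "human_label": 0, "llm_label": 1},
    {"group": "B", "human_label": 1, "llm_label": 1},
    {"group": "B", "human_label": 0, "llm_label": 0},
]

rates = agreement_by_group(records)
for group, rate in sorted(rates.items()):
    print(f"group {group}: agreement {rate:.2f}")
```

Similar rates across groups would be in line with the finding reported above, though a raw agreement rate ignores the model, the prompt, and human-human disagreement, which the abstract identifies as the stronger predictors.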

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
Cite as:	arXiv:2503.23243 [cs.CL]
	(or arXiv:2503.23243v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2503.23243

Submission history

From: Megan Brown
[v1] Sat, 29 Mar 2025 22:53:15 UTC (71 KB)
