Are Models Biased on Text without Gender-related Language?

Belém, Catarina G; Seshadri, Preethi; Razeghi, Yasaman; Singh, Sameer


arXiv:2405.00588 (cs)

[Submitted on 1 May 2024]

Abstract: Gender bias research has been pivotal in revealing undesirable behaviors in large language models, exposing serious gender stereotypes associated with occupations and emotions. A key observation in prior work is that models reinforce stereotypes as a consequence of the gendered correlations that are present in the training data. In this paper, we focus on bias where the effect of training data is unclear, and instead address the question: Do language models still exhibit gender bias in non-stereotypical settings? To do so, we introduce UnStereoEval (USE), a novel framework tailored for investigating gender bias in stereotype-free scenarios. USE defines a sentence-level score based on pretraining data statistics to determine whether a sentence contains minimal word-gender associations. To systematically benchmark the fairness of popular language models in stereotype-free scenarios, we utilize USE to automatically generate benchmarks without any gender-related language. By leveraging USE's sentence-level score, we also repurpose prior gender bias benchmarks (Winobias and Winogender) for non-stereotypical evaluation. Surprisingly, we find low fairness across all 28 tested models. Concretely, models demonstrate fair behavior in only 9%-41% of stereotype-free sentences, suggesting that bias does not solely stem from the presence of gender-related words. These results raise important questions about where underlying model biases come from and highlight the need for more systematic and comprehensive bias evaluation. We release the full dataset and code at this https URL.
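The abstract does not spell out how the sentence-level score is computed. As a rough illustration only, the sketch below assumes each word's gender association is the difference between its pointwise mutual information (PMI) with "she" and with "he", and that a sentence is scored by its most strongly associated word. The function names, the toy counts, the smoothing constant, and the threshold `epsilon` are all hypothetical and are not taken from the paper.

```python
import math

# Toy co-occurrence counts standing in for pretraining data statistics.
# These numbers are invented for illustration only.
TOY_COOC = {
    ("nurse", "she"): 90, ("nurse", "he"): 10,
    ("table", "she"): 50, ("table", "he"): 50,
}
TOY_TOTALS = {"she": 1000, "he": 1000}  # hypothetical pronoun frequencies


def gender_skew(word, cooc, pronoun_totals, smoothing=1.0):
    """PMI(word, 'she') - PMI(word, 'he').

    The word's own frequency appears in both PMI terms and cancels, so only
    the smoothed joint counts and the pronoun totals are needed.
    """
    she = (cooc.get((word, "she"), 0) + smoothing) / pronoun_totals["she"]
    he = (cooc.get((word, "he"), 0) + smoothing) / pronoun_totals["he"]
    return math.log(she / he)


def sentence_score(sentence, cooc, pronoun_totals):
    """Largest absolute word-level skew in the sentence (0.0 if it is empty).

    Tokenization is a naive lowercase whitespace split (an assumption).
    """
    words = sentence.lower().split()
    return max(
        (abs(gender_skew(w, cooc, pronoun_totals)) for w in words),
        default=0.0,
    )


def is_stereotype_free(sentence, cooc, pronoun_totals, epsilon=0.5):
    """True when no word's skew exceeds the (hypothetical) threshold."""
    return sentence_score(sentence, cooc, pronoun_totals) <= epsilon
```

Under these toy counts, a sentence containing "nurse" would be filtered out while one containing only "table" would pass. Taking the maximum over words rather than the mean is a design choice in this sketch: a single strongly gendered word is enough to disqualify a sentence.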

Comments:	In International Conference on Learning Representations 2024
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Machine Learning (cs.LG)
Cite as:	arXiv:2405.00588 [cs.CL]
	(or arXiv:2405.00588v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2405.00588

Submission history

From: Catarina Belem
[v1] Wed, 1 May 2024 15:51:15 UTC (1,967 KB)
