More Distinctively Black and Feminine Faces Lead to Increased Stereotyping in Vision-Language Models

Lee, Messi H. J.; Montgomery, Jacob M.; Lai, Calvin K.

arXiv:2407.06194 (cs)

[Submitted on 22 May 2024]

Abstract: Vision-Language Models (VLMs), exemplified by GPT-4V, adeptly integrate text and vision modalities. This integration enhances Large Language Models' ability to mimic human perception, allowing them to process image inputs. Despite VLMs' advanced capabilities, however, there is a concern that VLMs inherit biases of both modalities in ways that make biases more pervasive and difficult to mitigate. Our study explores how VLMs perpetuate homogeneity bias and trait associations with regard to race and gender. When prompted to write stories based on images of human faces, GPT-4V describes subordinate racial and gender groups with greater homogeneity than dominant groups and relies on distinct, yet generally positive, stereotypes. Importantly, VLM stereotyping is driven by visual cues rather than group membership alone, such that faces rated as more prototypically Black and feminine are subject to greater stereotyping. These findings suggest that VLMs may associate subtle visual cues related to racial and gender groups with stereotypes in ways that could be challenging to mitigate. We explore the underlying reasons behind this behavior, discuss its implications, and emphasize the importance of addressing these biases as VLMs come to mirror human perception.

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2407.06194 [cs.CV]
	(or arXiv:2407.06194v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2407.06194

Submission history

From: Messi H.J. Lee
[v1] Wed, 22 May 2024 00:45:29 UTC (221 KB)
