Reproducibility Study of "ITI-GEN: Inclusive Text-to-Image Generation"

Daniel Gallo Fernández, Răzvan-Andrei Matisan, Alejandro Monroy Muñoz, Janusz Partyka

arXiv:2407.19996 (cs)

[Submitted on 29 Jul 2024]

Abstract: Text-to-image generative models often present issues regarding fairness with respect to certain sensitive attributes, such as gender or skin tone. This study aims to reproduce the results presented in "ITI-GEN: Inclusive Text-to-Image Generation" by Zhang et al. (2023a), which introduces a model to improve inclusiveness in these kinds of models. We show that most of the claims made by the authors about ITI-GEN hold: it improves the diversity and quality of generated images, it is scalable to different domains, it has plug-and-play capabilities, and it is efficient from a computational point of view. However, ITI-GEN sometimes uses undesired attributes as proxy features and it is unable to disentangle some pairs of (correlated) attributes such as gender and baldness. In addition, when the number of considered attributes increases, the training time grows exponentially and ITI-GEN struggles to generate inclusive images for all elements in the joint distribution. To solve these issues, we propose using Hard Prompt Search with negative prompting, a method that does not require training and that handles negation better than vanilla Hard Prompt Search. Nonetheless, Hard Prompt Search (with or without negative prompting) cannot be used for continuous attributes that are hard to express in natural language, an area where ITI-GEN excels as it is guided by images during training. Finally, we propose combining ITI-GEN and Hard Prompt Search with negative prompting.
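The two ideas the abstract leans on can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes binary attributes and assumes that negative prompting means moving each unwanted attribute out of the prompt and into a separate negative prompt. The helper name `build_prompt_pairs`, the base prompt, and the attribute strings are hypothetical. Enumerating the joint distribution also shows why the number of combinations doubles with every added binary attribute.

```python
from itertools import product


def build_prompt_pairs(base_prompt, attributes):
    """Enumerate every combination of binary attributes (hypothetical helper).

    Returns one (prompt, negative_prompt) pair per combination. Attributes
    that should appear go into the prompt; attributes that should be absent
    go into the negative prompt instead of being written as "without X".
    """
    pairs = []
    for present in product([True, False], repeat=len(attributes)):
        wanted = [a for a, keep in zip(attributes, present) if keep]
        unwanted = [a for a, keep in zip(attributes, present) if not keep]
        prompt = base_prompt + (" with " + " and ".join(wanted) if wanted else "")
        negative_prompt = ", ".join(unwanted)
        pairs.append((prompt, negative_prompt))
    return pairs


# Two binary attributes give 2**2 = 4 elements in the joint distribution.
pairs = build_prompt_pairs("a headshot of a person", ["eyeglasses", "a beard"])
for prompt, negative_prompt in pairs:
    print(f"prompt={prompt!r} negative_prompt={negative_prompt!r}")
```

Each pair could then be passed to any text-to-image sampler that accepts a negative prompt; how the pairs are consumed is left open here because the abstract does not specify it.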

Comments:	Accepted to TMLR
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2407.19996 [cs.CV]
	(or arXiv:2407.19996v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2407.19996

Submission history

From: Daniel Gallo Fernández
[v1] Mon, 29 Jul 2024 13:27:44 UTC (22,446 KB)
