Dataset Augmentation by Mixing Visual Concepts

Rahat, Abdullah Al; Venkateswara, Hemanth

arXiv:2412.15358 (cs)

[Submitted on 19 Dec 2024]

Abstract: This paper proposes a dataset augmentation method based on fine-tuning pre-trained diffusion models. Generating images with a pre-trained diffusion model under textual conditioning often results in a domain discrepancy between real data and generated images. We propose a fine-tuning approach in which we adapt the diffusion model by conditioning it on real images and novel text embeddings. We introduce a procedure called Mixing Visual Concepts (MVC), in which we create novel text embeddings from image captions. MVC enables us to generate multiple images that are diverse yet similar to the real data, allowing effective dataset augmentation. We perform comprehensive qualitative and quantitative evaluations of the proposed approach, showcasing both coarse-grained and fine-grained changes in the generated images. Our approach outperforms state-of-the-art augmentation techniques on benchmark classification tasks.
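
The abstract does not say how MVC builds its novel text embeddings, so the following is only an illustrative sketch under one assumption: that two caption embeddings are blended by a convex combination. The function name `mix_embeddings`, the weight `alpha`, and the toy vectors are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def mix_embeddings(a: Sequence[float], b: Sequence[float], alpha: float) -> List[float]:
    """Blend two equal-length embedding vectors as alpha * a + (1 - alpha) * b.

    Assumption for illustration only; the paper's actual MVC procedure may differ.
    """
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * x + (1.0 - alpha) * y for x, y in zip(a, b)]


# Toy stand-ins for the text embeddings of two image captions (hypothetical values).
caption_a = [1.0, 0.0, 2.0]
caption_b = [0.0, 1.0, 4.0]

# An equal-weight blend of the two caption embeddings.
mixed = mix_embeddings(caption_a, caption_b, alpha=0.5)
```

In a real pipeline the inputs would be the outputs of a text encoder rather than hand-written lists, and the blended vector would serve as the conditioning signal for the fine-tuned diffusion model; both steps are outside what the abstract specifies.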

Comments: Accepted at WACV 2025 main conference
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2412.15358 [cs.CV] (or arXiv:2412.15358v1 [cs.CV] for this version)
DOI: https://doi.org/10.48550/arXiv.2412.15358

Submission history

From: Md Abdullah Al Rahat Kutubi
[v1] Thu, 19 Dec 2024 19:42:22 UTC (8,898 KB)
