Text-Guided Mixup Towards Long-Tailed Image Categorization

Franklin, Richard; Yao, Jiawei; Zhong, Deyang; Qian, Qi; Hu, Juhua

arXiv:2409.03583 (cs)

[Submitted on 5 Sep 2024]

Abstract: In many real-world applications, the class labels of the training data can follow a long-tailed frequency distribution, which challenges traditional approaches to training deep neural networks that require large amounts of balanced data. Gathering and labeling data to balance the class label distribution can be both costly and time-consuming. Many existing solutions, such as ensemble learning, re-balancing strategies, or fine-tuning of deep neural networks, are limited by the inherent problem of having few samples for a subset of classes. Recently, vision-language models like CLIP have been observed to be effective for zero-shot and few-shot learning because they capture the similarity between vision and language features of image-text pairs. Considering that large pre-trained vision-language models may contain valuable textual side information for minority classes, we propose to leverage text supervision to tackle the challenge of long-tailed learning. Concretely, we propose a novel text-guided mixup technique that takes advantage of the semantic relations between classes recognized by the pre-trained text encoder to help alleviate the long-tailed problem. Our empirical study on benchmark long-tailed tasks demonstrates the effectiveness of our proposal, which comes with a theoretical guarantee. Our code is available at this https URL.

Comments:	Accepted by BMVC'24, code is available at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2409.03583 [cs.CV]
	(or arXiv:2409.03583v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2409.03583

Submission history

From: Jiawei Yao
[v1] Thu, 5 Sep 2024 14:37:43 UTC (6,067 KB)
