Debiased Learning from Naturally Imbalanced Pseudo-Labels

Wang, Xudong; Wu, Zhirong; Lian, Long; Yu, Stella X.

arXiv:2201.01490 (cs)

[Submitted on 5 Jan 2022 (v1), last revised 21 Apr 2022 (this version, v2)]

Abstract: Pseudo-labels are confident predictions made on unlabeled target data by a classifier trained on labeled source data. They are widely used for adapting a model to unlabeled data, e.g., in a semi-supervised learning setting.
Our key insight is that pseudo-labels are naturally imbalanced due to intrinsic data similarity, even when a model is trained on balanced source data and evaluated on balanced target data. By addressing this previously unknown imbalanced classification problem, which arises from pseudo-labels rather than from ground-truth training labels, we can remove model biases towards the false majorities created by pseudo-labels.
We propose a novel and effective debiased learning method with pseudo-labels, based on counterfactual reasoning and adaptive margins: The former removes the classifier response bias, whereas the latter adjusts the margin of each class according to the imbalance of pseudo-labels. Validated by extensive experimentation, our simple debiased learning delivers significant accuracy gains over the state-of-the-art on ImageNet-1K: 26% for semi-supervised learning with 0.2% annotations and 9% for zero-shot learning. Our code is available at: this https URL.
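
The abstract names the two ingredients (removing the classifier response bias and setting per-class margins from the pseudo-label imbalance) but gives no formulas. The following is a minimal sketch of how such a correction might look, not the authors' implementation. It assumes a log-prior-style adjustment: the batch-averaged softmax output `p_hat` stands in for the pseudo-label distribution, logits are shifted by `-lam * log(p_hat)`, and the margin for class j is taken as `lam * log(1 / p_hat[j])`. All function names, the `lam` parameter, the threshold value, and the toy logits are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value (first index wins on ties)."""
    return max(range(len(values)), key=lambda j: values[j])


def confident_pseudo_labels(batch_logits, threshold):
    """Keep (sample_index, class) pairs whose top probability reaches the threshold."""
    kept = []
    for i, logits in enumerate(batch_logits):
        probs = softmax(logits)
        top = argmax(probs)
        if probs[top] >= threshold:
            kept.append((i, top))
    return kept


def mean_prediction(batch_logits):
    """Average softmax output over the batch, used here as an imbalance estimate."""
    num_classes = len(batch_logits[0])
    sums = [0.0] * num_classes
    for logits in batch_logits:
        for j, p in enumerate(softmax(logits)):
            sums[j] += p
    return [s / len(batch_logits) for s in sums]


def class_margins(p_hat, lam=1.0):
    """Assumed margin form: lam * log(1 / p_hat[j]), larger for rarer classes."""
    return [lam * math.log(1.0 / p) for p in p_hat]


def debias_logits(logits, p_hat, lam=1.0):
    """Assumed debiasing form: subtract lam * log(p_hat[j]) from each logit."""
    return [z - lam * math.log(p) for z, p in zip(logits, p_hat)]


# Hypothetical toy batch with three classes. Samples 2 and 3 lean slightly
# towards class 0, so the raw predictions over-represent that class.
batch = [
    [2.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [1.2, 1.0, 0.0],
    [1.2, 0.0, 1.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 2.0],
]

p_hat = mean_prediction(batch)
raw = [argmax(row) for row in batch]
debiased = [argmax(debias_logits(row, p_hat)) for row in batch]

print(raw)       # [0, 0, 0, 0, 1, 2]
print(debiased)  # [0, 0, 1, 2, 1, 2]
```

In this toy batch the raw predictions assign four of six samples to class 0, while the adjusted logits give two samples per class. Only the two borderline samples change, because the adjustment shifts every sample by the same per-class offset.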

Comments:	Accepted by CVPR 2022
Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2201.01490 [cs.LG]
	(or arXiv:2201.01490v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2201.01490

Submission history

From: Xudong Wang
[v1] Wed, 5 Jan 2022 07:40:24 UTC (3,760 KB)
[v2] Thu, 21 Apr 2022 09:13:11 UTC (5,740 KB)
