Causal-Informed Contrastive Learning: Towards Bias-Resilient Pre-training under Concept Drift

Yang, Xiaoyu; Lu, Jie; Yu, En

arXiv:2502.07620 (cs)

[Submitted on 11 Feb 2025]

Abstract: Large-scale contrastive pre-training, propelled by top-tier datasets, has reached a transition point in the scaling law. Consequently, sustaining and enhancing a model's pre-training capabilities in drifting environments has become a notable challenge. In this paper, we first show that contrastive pre-training methods are significantly affected by concept drift, in which distributions change unpredictably, resulting in notable biases in the feature space of the pre-trained model. Drawing on causal inference, we construct a structural causal graph to systematically analyze the impact of concept drift on contrastive pre-training, and propose a causal interventional contrastive objective. Building on this objective, we devise a resilient contrastive pre-training approach that accommodates data streams with concept drift, with a simple and scalable implementation. Extensive experiments on various downstream tasks demonstrate that our resilient contrastive pre-training effectively mitigates the bias stemming from the concept drift data stream. Code is available at this https URL.

Comments:	17 pages, 3 figures
Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2502.07620 [cs.LG]
	(or arXiv:2502.07620v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2502.07620

Submission history

From: Xiaoyu Yang
[v1] Tue, 11 Feb 2025 15:09:05 UTC (1,571 KB)
