Augmentation Invariant Manifold Learning

Wang, Shulei


arXiv:2211.00460v1 (stat)

[Submitted on 1 Nov 2022 (this version), latest version 26 Nov 2023 (v2)]

Abstract: Data augmentation is a widely used technique and an essential ingredient in recent advances in self-supervised representation learning. By preserving the similarity between augmented data, the resulting data representation can improve various downstream analyses and achieve state-of-the-art performance in many applications. To demystify the role of data augmentation, we develop a statistical framework on a low-dimensional product manifold to explain theoretically why unlabeled augmented data can lead to useful data representations. Under this framework, we propose a new representation learning method called augmentation invariant manifold learning and develop the corresponding loss function, which can work with a deep neural network to learn data representations. Compared with existing methods, the new data representation simultaneously exploits the manifold's geometric structure and the invariance property of augmented data. Our theoretical investigation precisely characterizes how the data representation learned from augmented data can improve the $k$-nearest neighbor classifier in the downstream analysis, showing that a more complex data augmentation leads to more improvement in downstream analysis. Finally, numerical experiments on simulated and real datasets are presented to support the theoretical results in this paper.
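The intuition behind the $k$-nearest neighbor claim can be illustrated with a small toy sketch. This is not the paper's method or loss function: the torus-shaped product manifold, the label rule, the `nuisance_scale` parameter, and the oracle `invariant_rep` function are all assumptions made here for illustration. The sketch only compares a $k$-NN classifier on raw observations with one on a representation that ignores the factor an augmentation would change.

```python
import math
import random


def observe(theta, phi, nuisance_scale=5.0):
    """Embed a torus point (theta, phi) in R^4.

    theta carries the label; phi is a nuisance factor that an augmentation
    would resample. Both the embedding and the scale are assumptions.
    """
    return (math.cos(theta), math.sin(theta),
            nuisance_scale * math.cos(phi), nuisance_scale * math.sin(phi))


def invariant_rep(x):
    """Hypothetical oracle representation: keep only the theta block.

    A learned augmentation-invariant representation would have to be
    estimated from data; here it is simply assumed to be known.
    """
    return x[:2]


def knn_predict(train_feats, train_labels, query, k=3):
    """Majority vote among the k training points closest to the query."""
    order = sorted(range(len(train_feats)),
                   key=lambda i: math.dist(train_feats[i], query))
    votes = sum(train_labels[i] for i in order[:k])
    return 1 if 2 * votes > k else 0


def accuracy(train_feats, train_labels, test_feats, test_labels, k=3):
    """Fraction of test points whose k-NN prediction matches the label."""
    hits = sum(knn_predict(train_feats, train_labels, q, k) == y
               for q, y in zip(test_feats, test_labels))
    return hits / len(test_labels)


def sample(n, rng):
    """Draw n labeled observations; the label depends on theta only."""
    xs, ys = [], []
    for _ in range(n):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        xs.append(observe(theta, phi))
        ys.append(1 if theta < math.pi else 0)
    return xs, ys


rng = random.Random(0)
train_x, train_y = sample(60, rng)
test_x, test_y = sample(300, rng)

acc_raw = accuracy(train_x, train_y, test_x, test_y)
acc_inv = accuracy([invariant_rep(x) for x in train_x], train_y,
                   [invariant_rep(x) for x in test_x], test_y)
print(f"k-NN accuracy on raw observations:      {acc_raw:.3f}")
print(f"k-NN accuracy on invariant coordinates: {acc_inv:.3f}")
```

In this toy setting the raw distances are dominated by the nuisance coordinate, so neighbors in the raw space say little about the label, while neighbors in the invariant coordinates are close in the label-relevant factor.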

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST); Methodology (stat.ME)
Cite as:	arXiv:2211.00460 [stat.ML]
	(or arXiv:2211.00460v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2211.00460

Submission history

From: Shulei Wang
[v1] Tue, 1 Nov 2022 13:42:44 UTC (622 KB)
[v2] Sun, 26 Nov 2023 19:20:39 UTC (894 KB)
