Opening the black-box of Neighbor Embedding with Hotelling's T2 statistic and Q-residuals

Rainer, Roman Josef; Mayr, Michael; Himmelbauer, Johannes; Nikzad-Langerodi, Ramin

arXiv:2209.01984 (stat)

[Submitted on 5 Sep 2022]

Abstract: In contrast to classical techniques for exploratory analysis of high-dimensional data sets, such as principal component analysis (PCA), neighbor embedding (NE) techniques tend to better preserve the local structure/topology of high-dimensional data. However, the ability to preserve local structure comes at the expense of interpretability: techniques such as t-Distributed Stochastic Neighbor Embedding (t-SNE) or Uniform Manifold Approximation and Projection (UMAP) do not give insights into which input variables underlie the topological (cluster) structure seen in the corresponding embedding. Here we propose different "tricks" from the chemometrics field based on PCA, Q-residuals and Hotelling's T2 contributions, in combination with novel visualization approaches, to derive local and global explanations of neighbor embeddings. We show that our approach is capable of identifying discriminatory features between groups of data points that remain unnoticed when exploring NEs using standard univariate or multivariate approaches.
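The quantities named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `pca_t2_q`, the synthetic data, and the particular per-variable contribution formulas are assumptions. They were chosen so that the contributions of each sample sum to its Hotelling's T2 statistic and its Q-residual respectively, which is one common textbook convention among several.

```python
import numpy as np


def pca_t2_q(X, n_components):
    """Fit PCA by SVD; return T2, Q, and per-variable contributions to each.

    Hypothetical helper for illustration only.
    """
    Xc = X - X.mean(axis=0)                       # mean-center each variable
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                       # loadings: variables x components
    T = Xc @ P                                    # scores: samples x components
    eigvals = s[:n_components] ** 2 / (X.shape[0] - 1)  # variance of each score

    # Hotelling's T2: squared scores scaled by their variances, summed per sample.
    t2 = np.sum(T ** 2 / eigvals, axis=1)
    # One convention for T2 contributions (rows sum to t2; entries may be negative).
    t2_contrib = Xc * ((T / eigvals) @ P.T)

    # Q-residuals: squared reconstruction error outside the PCA subspace.
    E = Xc - T @ P.T
    q_contrib = E ** 2                            # rows sum to q
    q = q_contrib.sum(axis=1)
    return t2, q, t2_contrib, q_contrib


# Usage on synthetic data (assumed example): shift variable 3 in one group.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
X[:20, 3] += 4.0
t2, q, t2_contrib, q_contrib = pca_t2_q(X, n_components=2)

# Average contribution of each variable within the shifted group.
print(t2_contrib[:20].mean(axis=0))
print(q_contrib[:20].mean(axis=0))
```

Comparing averaged contributions between groups of points is one simple way to see which input variables separate them, whether the difference lies inside the PCA subspace (T2) or outside it (Q).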

Comments:	16 pages, 10 figures
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:2209.01984 [stat.ML]
	(or arXiv:2209.01984v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2209.01984

Submission history

From: Ramin Nikzad-Langerodi Dr.
[v1] Mon, 5 Sep 2022 14:33:42 UTC (45,150 KB)
