Spectral clustering in the Gaussian mixture block model

Li, Shuangping; Schramm, Tselil

arXiv:2305.00979 (stat)

[Submitted on 29 Apr 2023 (v1), last revised 10 Apr 2024 (this version, v3)]

Abstract: Gaussian mixture block models are distributions over graphs that strive to model modern networks: to generate a graph from such a model, we associate each vertex $i$ with a latent feature vector $u_i \in \mathbb{R}^d$ sampled from a mixture of Gaussians, and we add edge $(i,j)$ if and only if the feature vectors are sufficiently similar, in that $\langle u_i,u_j \rangle \ge \tau$ for a pre-specified threshold $\tau$. The different components of the Gaussian mixture represent the fact that there may be different types of nodes with different distributions over features -- for example, in a social network each component represents the different attributes of a distinct community. Natural algorithmic tasks associated with these networks are embedding (recovering the latent feature vectors) and clustering (grouping nodes by their mixture component).
In this paper we initiate the study of clustering and embedding graphs sampled from high-dimensional Gaussian mixture block models, where the dimension of the latent feature vectors $d\to \infty$ as the size of the network $n \to \infty$. This high-dimensional setting is most appropriate in the context of modern networks, in which we think of the latent feature space as being high-dimensional. We analyze the performance of canonical spectral clustering and embedding algorithms for such graphs in the case of 2-component spherical Gaussian mixtures, and begin to sketch out the information-computation landscape for clustering and embedding in these models.
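As an illustration of the generative process described in the abstract, the following is a minimal sketch in Python with NumPy, not the authors' code. It samples a graph from a 2-component spherical mixture and then clusters it with a simple spectral heuristic. Several details are assumptions made only for this example, since the abstract does not state them: the component means are taken to be $\pm\mu$ with covariance $I_d/d$, the function names `sample_gmbm`, `spectral_cluster`, and `agreement` are hypothetical, and "spectral clustering" is taken to mean thresholding the eigenvector of the second-largest eigenvalue of the adjacency matrix at zero.

```python
import numpy as np


def sample_gmbm(n, d, mu_norm, tau, rng):
    """Sample a graph on n vertices from a 2-component mixture (assumed form).

    Assumption for this sketch: vertex i has a hidden label s_i in {-1, +1}
    and latent vector u_i ~ N(s_i * mu, I_d / d), where mu = mu_norm * e_1.
    An edge (i, j) is added iff <u_i, u_j> >= tau, as in the abstract.
    """
    labels = rng.integers(0, 2, size=n)           # hidden mixture component
    signs = 2 * labels - 1                        # map {0, 1} -> {-1, +1}
    mu = np.zeros(d)
    mu[0] = mu_norm
    u = signs[:, None] * mu + rng.standard_normal((n, d)) / np.sqrt(d)

    gram = u @ u.T                                # all pairwise inner products
    upper = np.triu(gram >= tau, k=1)             # decide each pair once, no loops
    adj = (upper | upper.T).astype(float)         # symmetric, zero diagonal
    return adj, labels, u


def spectral_cluster(adj):
    """Split vertices by the sign of the second-largest adjacency eigenvector."""
    _, vecs = np.linalg.eigh(adj)                 # eigenvalues in ascending order
    second = vecs[:, -2]
    return (second >= 0).astype(int)


def agreement(pred, labels):
    """Fraction of correctly grouped vertices, up to swapping the two names."""
    acc = float(np.mean(pred == labels))
    return max(acc, 1.0 - acc)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    adj, labels, _ = sample_gmbm(n=600, d=20, mu_norm=0.5, tau=0.0, rng=rng)
    print("agreement with hidden labels:", agreement(spectral_cluster(adj), labels))
```

The parameter values in the demo are arbitrary and chosen only so that the two communities are visibly separated; they are not taken from the paper and say nothing about the regimes the paper analyzes.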

Comments:	50 pages
Subjects:	Machine Learning (stat.ML); Data Structures and Algorithms (cs.DS); Social and Information Networks (cs.SI); Probability (math.PR); Statistics Theory (math.ST)
Cite as:	arXiv:2305.00979 [stat.ML]
	(or arXiv:2305.00979v3 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2305.00979

Submission history

From: Shuangping Li
[v1] Sat, 29 Apr 2023 23:56:55 UTC (47 KB)
[v2] Mon, 25 Mar 2024 07:47:57 UTC (56 KB)
[v3] Wed, 10 Apr 2024 23:06:38 UTC (58 KB)
