A Multiscale Environment for Learning by Diffusion

Murphy, James M.; Polk, Sam L.

arXiv:2102.00500 (cs)

[Submitted on 31 Jan 2021]

Abstract: Clustering algorithms partition a dataset into groups of similar points. The clustering problem is very general, and different partitions of the same dataset could be considered correct and useful. To fully understand such data, it must be considered at a variety of scales, ranging from coarse to fine. We introduce the Multiscale Environment for Learning by Diffusion (MELD) data model, which is a family of clusterings parameterized by nonlinear diffusion on the dataset. We show that the MELD data model precisely captures latent multiscale structure in data and facilitates its analysis. To efficiently learn the multiscale structure observed in many real datasets, we introduce the Multiscale Learning by Unsupervised Nonlinear Diffusion (M-LUND) clustering algorithm, which is derived from a diffusion process at a range of temporal scales. We provide theoretical guarantees for the algorithm's performance and establish its computational efficiency. Finally, we show that the M-LUND clustering algorithm detects the latent structure in a range of synthetic and real datasets.
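
The idea of reading structure off a diffusion process at several time scales can be illustrated with a small sketch. This is not the M-LUND algorithm and is not taken from the paper: it assumes a plain Gaussian-kernel random walk on a toy one-dimensional dataset, and the function names (`markov_matrix`, `diffusion_distances`), the bandwidth `sigma`, and the sample points are all hypothetical choices made for illustration. It compares the diffusion distance between two nearby groups of points with the distance to a far-away group as the diffusion time `t` grows.

```python
import numpy as np

def markov_matrix(X, sigma=1.0):
    """Row-stochastic random-walk matrix from a Gaussian kernel (illustrative choice)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / sigma**2)      # symmetric affinity matrix
    degree = W.sum(axis=1)
    P = W / degree[:, None]               # each row sums to 1
    pi = degree / degree.sum()            # stationary distribution of P
    return P, pi

def diffusion_distances(P, pi, t):
    """Pairwise diffusion distances at time t: weighted L2 distance between rows of P^t."""
    Pt = np.linalg.matrix_power(P, t)
    diff = Pt[:, None, :] - Pt[None, :, :]
    return np.sqrt(np.sum(diff**2 / pi, axis=-1))

# Toy data (assumed): two nearby groups (around 0 and 2.5) and one distant group (around 10).
X = np.array([0.0, 0.1, 0.2, 2.5, 2.6, 2.7, 10.0, 10.1, 10.2]).reshape(-1, 1)
P, pi = markov_matrix(X)

for t in (1, 64, 4096):
    D = diffusion_distances(P, pi, t)
    # Ratio of (near-group distance) to (far-group distance) for the first point.
    print(t, D[0, 3] / D[0, 6])
```

Under these assumptions, at small `t` all three groups look about equally far apart (a fine-scale view with three clusters), while at large `t` the two nearby groups have mixed and their distance shrinks relative to the distant group (a coarse-scale view with two clusters). The time parameter thus acts as the scale knob that the abstract alludes to.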

Comments: 35 pages, 10 figures
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Probability (math.PR); Machine Learning (stat.ML)
Cite as: arXiv:2102.00500 [cs.LG] (or arXiv:2102.00500v1 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.2102.00500

Submission history

From: Sam Polk
[v1] Sun, 31 Jan 2021 17:46:19 UTC (43,964 KB)
