Mathematics > Probability
[Submitted on 24 Mar 2025 (v1), last revised 25 Mar 2025 (this version, v2)]
Title: Hierarchical Clustering Algorithms on Poisson and Cox Point Processes
Abstract: Clustering is a widely used technique in unsupervised learning to identify groups within a dataset based on the similarities between its elements. This paper introduces three new hierarchical clustering models, Clustroid Hierarchical Nearest Neighbor ($\mathrm{CHN}^2$), Single Linkage Hierarchical Nearest Neighbor ($\mathrm{SHN}^2$), and Hausdorff (Complete Linkage) Hierarchical Nearest Neighbor ($\mathrm{H}^2\mathrm{N}^2$), all designed for datasets with a countably infinite number of points. These algorithms proceed through multiple levels of clustering and construct clusters by connecting nearest-neighbor points or clusters, but differ in the distance metrics they employ (clustroid, single linkage, or Hausdorff, respectively). Each method is first applied to the homogeneous Poisson point process on the Euclidean space, where it defines a phylogenetic forest, which is a factor of the point process and therefore unimodular. The results established for the $\mathrm{CHN}^2$ algorithm include the almost-sure finiteness of the clusters and bounds on the mean cluster size at each level of the algorithm. The mean size of the typical cluster is shown to be infinite. Moreover, the limiting structure of all three algorithms is examined as the number of levels tends to infinity, and properties such as the one-endedness of the limiting connected components are derived. In the specific case of $\mathrm{SHN}^2$ on the Poisson point process, the limiting graph is shown to be a subgraph of the Minimal Spanning Forest. The $\mathrm{CHN}^2$ algorithm is also extended beyond the Poisson setting, to certain stationary Cox point processes. Similar finite-cluster properties are shown to hold in these cases. It is also shown that efficient detection of Cox-triggered aggregation can be achieved through this clustering algorithm.
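The level-by-level nearest-neighbor construction described in the abstract can be illustrated with a rough finite-window sketch. This is not the paper's algorithm: the paper works on infinite point processes, whereas the sketch samples a homogeneous Poisson process in a bounded square (so boundary effects are present), and it assumes that one level consists of linking every current cluster to its nearest cluster and taking connected components. Only the single-linkage and Hausdorff set distances are shown, using their standard definitions; the clustroid variant is omitted because the abstract does not define the clustroid. All function names (`sample_poisson`, `merge_level`, `hierarchical_nn`) are hypothetical.

```python
import numpy as np


def pairwise(a, b):
    """Euclidean distances between every point of a (m, d) and b (k, d)."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)


def single_linkage(a, b):
    """Smallest point-to-point distance between the two clusters."""
    return pairwise(a, b).min()


def hausdorff(a, b):
    """Standard Hausdorff distance between two finite point sets."""
    d = pairwise(a, b)
    return max(d.min(axis=1).max(), d.min(axis=0).max())


def sample_poisson(intensity, side, rng):
    """Homogeneous Poisson points in the square [0, side]^2 (finite window)."""
    n = rng.poisson(intensity * side**2)
    return rng.uniform(0.0, side, size=(n, 2))


def merge_level(clusters, dist):
    """Assumed single level: link each cluster to its nearest cluster,
    then return the connected components as the new clusters."""
    n = len(clusters)
    if n < 2:
        return clusters
    D = np.full((n, n), np.inf)
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = dist(clusters[i], clusters[j])
    nearest = D.argmin(axis=1)

    # Union-find over the nearest-neighbor edges.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in enumerate(nearest):
        parent[find(i)] = find(int(j))

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(clusters[i])
    return [np.vstack(g) for g in groups.values()]


def hierarchical_nn(points, dist, levels):
    """Return the list of clusterings, from singletons up to `levels` merges."""
    clusters = [p[None, :] for p in points]
    history = [clusters]
    for _ in range(levels):
        clusters = merge_level(clusters, dist)
        history.append(clusters)
    return history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = sample_poisson(intensity=1.0, side=10.0, rng=rng)
    for name, dist in [("single linkage", single_linkage), ("Hausdorff", hausdorff)]:
        sizes = [len(level) for level in hierarchical_nn(pts, dist, levels=4)]
        print(name, "clusters per level:", sizes)
```

In this sketch every cluster is joined to at least one other cluster at each level, so the number of clusters at least halves per level until a single cluster remains. On a finite window the hierarchy therefore always collapses, which is an artifact of the truncation and says nothing about the limiting structure studied in the paper.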
Submission history
From: Sayeh Khaniha
[v1] Mon, 24 Mar 2025 11:06:36 UTC (7,356 KB)
[v2] Tue, 25 Mar 2025 10:14:25 UTC (7,356 KB)