HyEMST: A novel hybrid ellipsoidal framework for robust clustering via maximum spanning trees

Document Type : Research Article

Authors

Tarbiat Modares University of Tehran

Abstract

Clustering arbitrary-shaped clusters with heterogeneous densities presents a fundamental challenge in unsupervised learning. Traditional approaches emphasize either geometric distance or local density estimation, yet rarely reconcile both perspectives systematically. This paper introduces HyEMST (Hybrid Ellipsoidal Maximum Spanning Tree), a principled framework that unifies distance and density information through an explicit trade-off parameter λ ∈ [0,1]. The proposed methodology comprises five phases: (1) strategic geometric decomposition via K-Means over-segmentation; (2) robust volumetric density estimation using adaptive ridge-regularized covariance; (3) hybrid kernel construction integrating distance and density affinities; (4) topological structure discovery via maximum spanning tree; and (5) adaptive density-aware cluster merging. Theoretically, we establish that regularized covariance-based density estimation preserves density ranking with > 90% accuracy, ensuring reliable merging even for ill-conditioned micro-clusters. Computationally, the approach achieves O(N d2 ) overall complexity. Empirically, HyEMST attains perfect or near-perfect clustering on synthetic benchmarks and demonstrates superior performance compared to representative baselines on real-world datasets. Ablation studies validate the necessity of hybrid integration and confirm the efficacy of each algorithmic component.

Keywords

Main Subjects