Accelerated DBSCAN via parallel, density-aware multi-objective genetic optimization

Document Type : Research Article

Authors

Department of Computer Science, Tarbiat Modares University, Tehran, Iran

Abstract

Clustering is a fundamental task in data mining, where the quality of results often hinges on effective parameter selection. DBSCAN is widely used for discovering clusters of arbitrary shapes but is highly sensitive to its input parameters \textit{Eps} and \textit{MinPts}. This paper proposes an enhanced version of the Multi-Objective Genetic Algorithm for DBSCAN, termed \textbf{Enhanced MOGA-DBSCAN}, which introduces a modified objective function based on a density-aware Outlier Index and accelerates the optimization process through parallel computation. We evaluate the proposed method using two benchmark datasets and compare it against the original MOGA-DBSCAN as well as two adaptive variants: AMD-DBSCAN and WOA-DBSCAN. Results show that Enhanced MOGA-DBSCAN consistently achieves superior clustering performance, as measured by Rand Index and Normalized Mutual Information (NMI), while also reducing runtime relative to the original MOGA-DBSCAN. These findings highlight the effectiveness of our enhancements in improving both clustering quality and computational efficiency.

Keywords

Main Subjects


[1] R.J. Campello, D. Moulavi, A. Zimek, J. Sander, Hierarchical density estimates for data clustering,
visualization, and outlier detection, ACM Trans. Knowl. Discov. Data 10 (2015) 1–51.
[2] D. Deng, DBSCAN clustering algorithm based on density, in: Proc. 7th Int. Forum Electr. Eng.
Autom. (IFEEA), IEEE, 2020, pp. 949–953.
[3] A. Dudek, Silhouette index as clustering evaluation tool, in: Classification and Data Analysis:
Theory and Applications 28, Springer, 2020, pp. 19–33.
[4] H. Eyvazi, A. Rajaei, Enhanced MOGA-DBSCAN: Improving clustering quality with a density-
adjusted outlier index, in: Proc. 1st Int. Conf. Mach. Learn. Knowl. Discov. (MLKD), 2024, pp.
323–328.
[5] Z. Falahiazar, A. Bagheri, M. Reshadi, Determining the parameters of DBSCAN automatically
using the multi-objective genetic algorithm, J. Inf. Sci. Eng. 37 (2021) 157–183.
[6] K. Li, X. Gao, X. Jia, B. Xue, S. Fu, Z. Liu, X. Huang, Z. Huang, Detection of local and clustered
outliers based on the density–distance decision graph, Eng. Appl. Artif. Intell. 110 (2022) 104719.
[7] Y. Liu, G. Yin, The Delaunay triangulation learner and its ensembles, Comput. Statist. Data Anal.
152 (2020) 107030.
[8] Y. Ren, J. Pu, Z. Yang, J. Xu, G. Li, X. Pu, S. Y. Philip, L. He, Deep clustering: A comprehensive
survey, IEEE Trans. Neural Netw. Learn. Syst. (2024) (in press).
[9] E. Schubert, J. Sander, M. Ester, H.-P. Kriegel, X. Xu, DBSCAN revisited, revisited: Why and how
you should (still) use DBSCAN, ACM Trans. Database Syst. 42 (2017) 1–21.
[10] H.V. Singh, A. Girdhar, S. Dahiya, A literature survey based on DBSCAN algorithms, in: Proc. 6th
Int. Conf. Intell. Comput. Control Syst. (ICICCS), IEEE, 2022, pp. 751–758.
[11] S. Verma, M. Pant, V. Snasel, A comprehensive review on NSGA-II for multi-objective combinato-
rial optimization problems, IEEE Access 9 (2021) 57757–57791.
[12] Z. Wang, Z. Ye, Y. Du, Y. Mao, Y. Liu, Z. Wu, J. Wang, AMD-DBSCAN: An adaptive multi-density
DBSCAN for datasets of extremely variable density, in: Proc. IEEE 9th Int. Conf. Data Sci. Adv.
Anal. (DSAA), 2022, pp. 1–10.
[13] X. Xu, S. Ding, L. Wang, Y. Wang, A robust density peaks clustering algorithm with density-
sensitive similarity, Knowl.-Based Syst. 200 (2020) 106028.
[14] X. Zhang, S. Zhou, WOA-DBSCAN: Application of whale optimization algorithm in DBSCAN pa-
rameter adaption, IEEE Access 11 (2023) 91861–91878.
[15] Y. Zhou, W. Zhang, J. Kang, X. Zhang, X. Wang, A problem-specific non-dominated sorting genetic
algorithm for supervised feature selection,Inf. Sci. 547 (2021) 841–859.