Optimizing Data Exploration by Unifying Clustering and Association Rule Extraction

Youssef Fakir, Salim Khalil, Hamid Garmani, Mohamed Fakir

Abstract


The extraction of association rules remains a crucial strategy in data analysis, particularly for massive datasets. It uncovers complex relationships, correlations, and meaningful patterns in such data, providing essential insights for decision-making and for understanding behaviors. Our approach uses clustering algorithms to partition the data intelligently before any rules are extracted. Organizing the data into clusters first establishes a robust foundation for efficient association rule extraction, and in doing so we aim to improve the relevance and significance of the discovered rules.
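The cluster-then-extract idea can be illustrated with a minimal sketch. The abstract does not name the clustering or rule-mining algorithms used, so everything here is an assumption made for illustration: a one-pass assignment to seed transactions by Jaccard similarity stands in for the clustering step, and a simple single-item rule miner with support and confidence thresholds stands in for the extraction step. The function names `jaccard`, `partition`, and `pair_rules`, the toy transactions, and the threshold values are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two item sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def partition(transactions, seeds):
    """Assign each transaction to its most similar seed.

    A stand-in for a real clustering algorithm (assumption, not the paper's method).
    """
    clusters = [[] for _ in seeds]
    for t in transactions:
        best = max(range(len(seeds)), key=lambda i: jaccard(t, seeds[i]))
        clusters[best].append(t)
    return clusters


def pair_rules(transactions, min_support, min_confidence):
    """Return rules (lhs, rhs, support, confidence) between single items."""
    n = len(transactions)
    if n == 0:
        return []

    def count(itemset):
        # Number of transactions that contain every item in itemset.
        return sum(1 for t in transactions if itemset <= t)

    rules = []
    items = sorted(set().union(*transactions))
    for a, b in combinations(items, 2):
        both = count({a, b})
        support = both / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = both / count({lhs})
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Toy data (hypothetical): two shopping profiles mixed in one dataset.
transactions = [
    {"bread", "butter"}, {"bread", "butter", "jam"}, {"bread", "butter"},
    {"laptop", "mouse"}, {"laptop", "mouse", "bag"}, {"laptop", "mouse"},
]

# Mining the whole dataset: no pair reaches 60% support.
global_rules = pair_rules(transactions, min_support=0.6, min_confidence=0.8)

# Partition first, then mine each cluster with the same thresholds.
clusters = partition(transactions, seeds=[transactions[0], transactions[3]])
rules_per_cluster = [pair_rules(c, 0.6, 0.8) for c in clusters]

for idx, rules in enumerate(rules_per_cluster):
    for lhs, rhs, support, confidence in rules:
        print(f"cluster {idx}: {lhs} -> {rhs} "
              f"(support={support:.2f}, confidence={confidence:.2f})")
```

In this toy case the same thresholds yield no rules on the mixed data but recover the bread/butter and laptop/mouse rules inside each cluster, which is the kind of effect the partitioning step is meant to produce. Real datasets would call for a proper clustering algorithm and a full frequent-itemset miner.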


DOI: https://doi.org/10.31449/inf.v49i18.5964

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.