Optimizing Data Exploration by Unifying Clustering and Association Rule Extraction
Abstract
Association rule extraction remains a central technique in data analysis, particularly for massive datasets. It uncovers relationships, correlations, and recurring patterns in the data, yielding insights that support decision-making and the understanding of behaviors. Our approach is distinguished by its use of clustering algorithms to partition the data before rule extraction, which provides a sound basis for efficient extraction. By organizing the data with clustering techniques before applying the extraction algorithm, we aim to improve the relevance and significance of the discovered rules.
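The cluster-then-extract idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names neither the clustering algorithm nor the extraction algorithm, so the sketch assumes a farthest-first medoid partition under Jaccard distance and a brute-force search for single-item rules. The function names (`partition`, `pair_rules`), the thresholds, and the toy transactions are all hypothetical.

```python
from itertools import combinations

def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two item sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def partition(transactions, k):
    """Assumed clustering step: farthest-first medoids, one assignment pass."""
    medoids = [transactions[0]]
    while len(medoids) < k:
        # Next medoid: the transaction farthest from every medoid chosen so far.
        medoids.append(max(
            transactions,
            key=lambda t: min(jaccard_distance(t, m) for m in medoids)))
    clusters = [[] for _ in medoids]
    for t in transactions:
        nearest = min(range(len(medoids)),
                      key=lambda i: jaccard_distance(t, medoids[i]))
        clusters[nearest].append(t)
    return clusters

def pair_rules(transactions, min_support, min_confidence):
    """Assumed extraction step: rules a -> c between single items only."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = []
    for x, y in combinations(items, 2):
        both = sum(1 for t in transactions if x in t and y in t)
        if both / n < min_support:
            continue
        for a, c in ((x, y), (y, x)):
            count_a = sum(1 for t in transactions if a in t)
            confidence = both / count_a
            if confidence >= min_confidence:
                rules.append((a, c, both / n, confidence))
    return rules

# Hypothetical toy data: two groups of shopping baskets.
transactions = [
    frozenset({"bread", "butter"}),
    frozenset({"bread", "butter", "jam"}),
    frozenset({"bread", "butter"}),
    frozenset({"laptop", "mouse"}),
    frozenset({"laptop", "mouse", "bag"}),
    frozenset({"laptop", "mouse"}),
]

clusters = partition(transactions, 2)
global_rules = pair_rules(transactions, 0.6, 0.9)
print("rules on the whole dataset:", global_rules)
for index, cluster in enumerate(clusters):
    print("rules in cluster", index, ":", pair_rules(cluster, 0.6, 0.9))
```

On this toy data, no item pair reaches the 0.6 support threshold over the whole dataset, whereas each cluster yields rules at the same thresholds. This only illustrates the intuition behind partitioning first; it says nothing about the results reported in the article.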
DOI: https://doi.org/10.31449/inf.v49i18.5964
License
Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.
All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.
Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.







