OLAP Mining with Educational Data Mart to Predict Students’ Performance

Ihab Ahmed Najm, Jasim Mohammed Dahr, Alaa Khalaf Hamoud, Ali Salah Alasady, Wid Akeel Awadh, Mohammed B. M. Kamel, Aqeel Majeed Humadi

Abstract


Academic institutions need a solid platform to support their short- and long-term decisions about academic performance. Such platforms turn historical data into strategic decisions, but discovering the hidden patterns in that data requires suitable tools and approaches. This paper presents a short roadmap for implementing an educational data mart based on a dataset from Alexandria Private Elementary School, located in Basrah province, Iraq, for the 2017-2018 academic year. The educational data mart is implemented first, and a cube is then constructed to perform OLAP operations and present OLAP reports. Next, OLAP mining is performed on the educational cube using nine algorithms, namely: decision tree with score method (entropy) and split method (complete), decision tree with score method (entropy) and split method (complete), decision tree with score method (entropy) and split method (both), logistic regression, Naïve Bayes, neural network, clustering with expectation maximization, K-means clustering, and association rule mining. In a comparison of all nine algorithms, clustering with expectation maximization achieved the highest accuracy: 96.76% for predicting students' performance and 96.12% for predicting students' grades.
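The cube operations mentioned in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the fact rows, the dimension names (grade, subject, term), and the helper functions roll_up and slice_cube are all hypothetical and serve only to show what a roll-up and a slice compute over a small fact table.

```python
from collections import defaultdict

# Hypothetical fact rows. This is NOT the paper's dataset; the dimension
# names (grade, subject, term) and the scores are assumptions for illustration.
FACTS = [
    {"grade": "1", "subject": "Math", "term": "T1", "score": 80},
    {"grade": "1", "subject": "Math", "term": "T2", "score": 90},
    {"grade": "1", "subject": "Science", "term": "T1", "score": 70},
    {"grade": "2", "subject": "Math", "term": "T1", "score": 60},
    {"grade": "2", "subject": "Science", "term": "T1", "score": 100},
]


def roll_up(facts, dims, measure="score"):
    """Roll-up: average `measure` grouped by the dimensions listed in `dims`."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in facts:
        key = tuple(row[d] for d in dims)
        sums[key] += row[measure]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def slice_cube(facts, dim, value):
    """Slice: keep only the rows where dimension `dim` equals `value`."""
    return [row for row in facts if row[dim] == value]


# Coarse view: average score per grade.
by_grade = roll_up(FACTS, ["grade"])
# Drill-down: average score per (grade, subject).
by_grade_subject = roll_up(FACTS, ["grade", "subject"])
# Slice on subject = Math, then roll up by grade.
math_by_grade = roll_up(slice_cube(FACTS, "subject", "Math"), ["grade"])
```

Aggregates of this kind are what OLAP reports display; the mining step described in the abstract would then operate on the cube's dimensions and measures rather than on the raw records.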






DOI: https://doi.org/10.31449/inf.v46i5.3853

This work is licensed under a Creative Commons Attribution 3.0 License.