Optimizing Public Hospital Budgets Using Ensemble Machine Learning and SHAP Analysis for Interpretable Cost Prediction
Abstract
Public hospitals face growing economic pressure, which makes careful resource management necessary. Most traditional cost-forecasting models fail to capture the dynamic, non-linear nature of healthcare costs. This paper presents an AI-based financial optimization framework built on interpretable ensemble machine learning techniques. The methodology comprises data preprocessing, feature engineering, and model training with optimized Random Forest and XGBoost algorithms, followed by SHAP (SHapley Additive exPlanations) analysis for model interpretability. The optimized XGBoost model achieved an R² score of 0.89, outperforming Random Forest (R² = 0.88) and the baseline models. It also achieved a Mean Absolute Error (MAE) of 2502.36 and a Mean Squared Error (MSE) of 11230456.12, indicating strong predictive accuracy. The SHAP analysis identifies key cost-driving factors such as smoking status, BMI, and age, enabling more transparent and informed decision-making by stakeholders. The framework offers a scalable solution for predictive budgeting and decision-making in public healthcare institutions.
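The three reported metrics are related, and a minimal sketch may help when reading them side by side. The function name `regression_metrics` and the toy cost values below are hypothetical illustrations, not the authors' code or data; only the MSE figure of 11230456.12 is taken from the abstract.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for paired actual and predicted costs."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n       # average absolute error
    mse = sum(e * e for e in errors) / n        # average squared error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    r2 = 1.0 - (mse * n) / ss_tot               # 1 - SS_res / SS_tot
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse), "r2": r2}


# Hypothetical toy data (assumption): four actual vs. predicted costs.
actual = [1000.0, 2000.0, 3000.0, 4000.0]
predicted = [1100.0, 1900.0, 3200.0, 3800.0]
metrics = regression_metrics(actual, predicted)

# MSE is expressed in squared cost units; its square root (RMSE) is on the
# same scale as MAE. Here it is applied to the MSE reported in the abstract.
reported_rmse = math.sqrt(11230456.12)
```

Under these definitions, the reported MSE corresponds to an RMSE of roughly 3351, which is on the same scale as the reported MAE of 2502.36. An RMSE noticeably larger than the MAE is the usual pattern when a minority of predictions carry large errors.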
DOI: https://doi.org/10.31449/inf.v49i22.7981
Copyright © Slovenian Society Informatika







