A Hybrid Deep Learning Framework for Cardiovascular Risk Prediction Using Temporal Embeddings, Ensemble Learning, and Bayesian Uncertainty Estimation

Abstract

This study presents a hybrid deep learning framework for predicting cardiovascular disease (CVD) risk. The framework combines Long Short-Term Memory (LSTM) autoencoders for temporal representation learning, hybrid feature fusion, stacked ensemble learning, and Bayesian uncertainty estimation. It is intended for early CVD risk stratification, with the aim of improving predictive performance, clinical acceptability, and interpretability. The data source was the Framingham Heart Study dataset, comprising 4,240 records and 16 clinical variables. Preprocessing consisted of Hampel filtering for outlier removal, mean imputation of missing values, and Min-Max normalization. Principal Component Analysis (PCA) was then applied to retain the components that explain the most variance. To simulate risk evolution, a synthetic temporal sequence was generated and passed through the LSTM autoencoder, yielding 32-dimensional latent features. These temporal embeddings were concatenated with the PCA components to form a 41-dimensional hybrid feature space. Class imbalance was addressed with the Synthetic Minority Over-sampling Technique (SMOTE). The stacked ensemble classifier used eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Categorical Boosting (CatBoost), and Gradient Boosting as base learners, with a Multilayer Perceptron (MLP) as the meta-learner. For uncertainty quantification, a separate Bayesian MLP using Monte Carlo Dropout was built. The stacked model achieved 96.06% accuracy, 97.67% recall, and a 99.31% area under the receiver operating characteristic curve (AUC-ROC), outperforming the individual classifiers. Bayesian analysis produced a mean predictive uncertainty of 0.087. Stratified risk assessment revealed clinically relevant clusters in which predicted and actual CVD incidence corresponded closely. The resulting model provides accurate, interpretable CVD risk prediction suitable for routine clinical use and wearable monitoring.
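
The feature fusion and Monte Carlo Dropout steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, layer sizes, dropout rate, number of stochastic passes, and random weights are all hypothetical, and the use of the standard deviation across passes as the uncertainty measure is an assumption, since the abstract does not say how the reported value of 0.087 was computed. The split of nine PCA components follows only from the arithmetic 41 - 32 = 9 and assumes nothing else is concatenated.

```python
import numpy as np


def hybrid_features(latent, pca_components):
    """Concatenate temporal embeddings and PCA components column-wise."""
    return np.concatenate([latent, pca_components], axis=1)


def mc_dropout_predict(x, w1, b1, w2, b2, rng, n_passes=100, drop_rate=0.2):
    """Monte Carlo Dropout: keep dropout active at inference time.

    Returns the mean predicted probability and the standard deviation
    across stochastic forward passes for each sample (one common, but
    here assumed, measure of predictive uncertainty).
    """
    keep = 1.0 - drop_rate
    probs = np.empty((n_passes, x.shape[0]))
    for t in range(n_passes):
        hidden = np.maximum(x @ w1 + b1, 0.0)       # ReLU hidden layer
        mask = rng.random(hidden.shape) < keep      # fresh dropout mask per pass
        hidden = hidden * mask / keep               # inverted dropout scaling
        logits = hidden @ w2 + b2
        probs[t] = 1.0 / (1.0 + np.exp(-logits[:, 0]))  # sigmoid output
    return probs.mean(axis=0), probs.std(axis=0)


# Hypothetical stand-in data: 5 samples, 32 latent features, 9 PCA components.
rng = np.random.default_rng(0)
latent = rng.normal(size=(5, 32))
pca = rng.normal(size=(5, 9))
x = hybrid_features(latent, pca)                    # shape (5, 41)

# Untrained random weights, used only to make the sketch runnable.
w1 = rng.normal(scale=0.3, size=(41, 16))
b1 = np.zeros(16)
w2 = rng.normal(scale=0.3, size=(16, 1))
b2 = np.zeros(1)

mean_prob, uncertainty = mc_dropout_predict(x, w1, b1, w2, b2, rng)
```

In a trained model, averaging `uncertainty` over the evaluation set would give a single summary figure comparable in kind to the mean predictive uncertainty reported above, provided the same measure is used.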

Author Biography

Jeena Joseph, Marian College Kuttikkanam Autonomous, Idukki, Kerala, India

Assistant Professor, Department of Computer Applications, Marian College Kuttikkanam Autonomous

Authors

  • Jeena Joseph, Marian College Kuttikkanam Autonomous, Idukki, Kerala, India
  • K Kartheeban, Kalasalingam Academy of Research and Education, Krishnankoil, Tamilnadu, India

DOI:

https://doi.org/10.31449/inf.v50i1.13040

Published

04/13/2026

How to Cite

Joseph, J., & Kartheeban, K. (2026). A Hybrid Deep Learning Framework for Cardiovascular Risk Prediction Using Temporal Embeddings, Ensemble Learning, and Bayesian Uncertainty Estimation. Informatica, 50(1). https://doi.org/10.31449/inf.v50i1.13040

Section

Regular papers