Cardiovascular Disease Prediction via Hybrid SVM–SMOTE and Sparse Autoencoder Feature Reduction with Deep MLP Classification
Abstract
Cardiovascular diseases remain the leading global cause of death, demanding diagnostic systems that are accurate, interpretable, and computationally efficient. Traditional machine learning approaches frequently struggle with class imbalance, high-dimensional noise, and limited generalization on clinical datasets. To address these issues, we propose a hybrid framework that combines SVM–SMOTE and the neighborhood cleaning rule (NCL) for class rebalancing, a sparse autoencoder (SAE) with random forest (RF) selection for non-linear feature optimization, and a class-weighted multilayer perceptron (MLP) for final classification. We validate the framework on the Z-Alizadeh Sani (54 features) and Cleveland (13 features) datasets under stratified fivefold cross-validation. The model attains mean accuracies of 94.02 ± 2.77% and 94.36 ± 1.47%, with AUC–ROC values of 0.988 and 0.982, respectively, outperforming prior baselines [4, 10, 14] by 7.6%–20.8%. Bootstrap 95% confidence intervals and McNemar/DeLong tests (p < 0.001) confirm the significance of these gains. Notably, the ablation study demonstrates the contribution of each module (e.g., a 12% accuracy improvement relative to the variant without sampling). The optimized MLP reduced false negatives to ~5% while training 40% faster than CNN–LSTM alternatives. The proposed framework provides a statistically robust and interpretable solution for predicting cardiovascular disease.
DOI: https://doi.org/10.31449/inf.v50i5.10455
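The abstract reports bootstrap 95% confidence intervals and McNemar tests but does not describe how they were computed. The following is a minimal sketch of one common way to obtain both, using only the Python standard library. It is not the authors' implementation: the function names `bootstrap_ci` and `mcnemar_exact`, the percentile method, the number of resamples, and the exact binomial form of McNemar's test are all assumptions chosen for illustration.

```python
import math
import random


def bootstrap_ci(correct, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for accuracy.

    `correct` is a list of 0/1 flags, one per test sample, where 1 means
    the classifier's prediction matched the label (hypothetical input).
    """
    rng = random.Random(seed)
    n = len(correct)
    stats = []
    for _ in range(n_resamples):
        # Resample the test set with replacement and recompute accuracy.
        resample = [correct[rng.randrange(n)] for _ in range(n)]
        stats.append(sum(resample) / n)
    stats.sort()
    lo = stats[round((alpha / 2) * n_resamples)]
    hi = stats[round((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value for two paired classifiers.

    b: samples model A classified correctly and model B incorrectly.
    c: samples model B classified correctly and model A incorrectly.
    Under the null hypothesis the discordant pairs follow Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Given per-sample correctness flags for one model, `bootstrap_ci(flags)` returns the lower and upper bounds of the interval; given the two discordant counts from a paired comparison, `mcnemar_exact(b, c)` returns the p-value. The exact form is used here because it needs no continuity correction when the discordant counts are small.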
License
Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.
All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.
Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.







