Ensemble Feature Fusion of VGG16, ResNet50, and Vision Transformer for Pneumonia Detection in Chest X-ray Images

Deepa A B; Varghese Paul

doi:10.31449/inf.v50i12.9647

Abstract

This study proposes a heterogeneous ensemble deep learning architecture for pneumonia classification from chest X-ray images that integrates two pretrained convolutional neural networks (CNNs), VGG16 and ResNet50, with a fine-tuned Vision Transformer (ViT). The model uses a feature-level fusion strategy: deep local spatial features extracted by the two CNN backbones are concatenated and fed into the ViT, which captures global contextual relationships via self-attention. This design addresses the limitations of standalone CNN and ViT models by combining their complementary strengths. Ablation studies and experimental evaluations show that the ensemble model outperforms the individual CNN and ViT baselines, achieving an accuracy of 98.5%, precision of 98.7%, recall of 98.3%, F1-score of 98.5%, and an area under the receiver operating characteristic curve (AUC-ROC) of 0.99 on the pneumonia X-ray dataset. The architecture balances detailed local feature extraction with holistic global context modelling, offering a robust and efficient solution for medical image classification.
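The fusion strategy described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the feature-map shapes (7×7×512 for VGG16 and 7×7×2048 for ResNet50, as produced by the standard backbones on 224×224 inputs), the channel-wise concatenation, the single attention head, the embedding width `d_model`, the mean pooling, and all random weights are assumptions made purely for illustration, and `fuse_and_attend` is a hypothetical helper name.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_and_attend(vgg_map, resnet_map, d_model=64, seed=0):
    """Concatenate two CNN feature maps and run one self-attention layer.

    All weights are random placeholders (assumption); a real model would
    learn them and stack several transformer blocks.
    """
    rng = np.random.default_rng(seed)

    # Feature-level fusion: join the two maps along the channel axis.
    fused = np.concatenate([vgg_map, resnet_map], axis=-1)  # (H, W, C1 + C2)
    h, w, c = fused.shape

    # Treat every spatial location as one token for the transformer.
    tokens = fused.reshape(h * w, c)                        # (H*W, C1 + C2)

    # Linear embedding into the transformer width.
    w_embed = rng.normal(0.0, c ** -0.5, (c, d_model))
    x = tokens @ w_embed                                    # (H*W, d_model)

    # Single-head scaled dot-product self-attention.
    wq, wk, wv = (rng.normal(0.0, d_model ** -0.5, (d_model, d_model))
                  for _ in range(3))
    q, k, v = x @ wq, x @ wk, x @ wv
    attn = softmax(q @ k.T / np.sqrt(d_model))              # (H*W, H*W)
    context = attn @ v                                      # (H*W, d_model)

    # Mean-pool the tokens and apply a logistic head (pneumonia vs. normal).
    pooled = context.mean(axis=0)
    w_out = rng.normal(0.0, d_model ** -0.5, d_model)
    prob = 1.0 / (1.0 + np.exp(-(pooled @ w_out)))
    return attn, float(prob)

# Stand-ins for backbone outputs; real features would come from the CNNs.
rng = np.random.default_rng(1)
vgg_map = rng.random((7, 7, 512))
resnet_map = rng.random((7, 7, 2048))
attn, prob = fuse_and_attend(vgg_map, resnet_map)
print(attn.shape)  # one attention row per spatial token
```

Each row of `attn` weights all 49 spatial locations against one another, which is how the attention stage can relate distant regions of the fused CNN features; the returned probability is meaningless here because the weights are untrained.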

Author Biographies

Deepa A B, Rajagiri School of Engineering & Technology, APJ Abdul Kalam Technological University, Kakkanad, Kerala, 682039, India and College of Engineering & Management, APJ Abdul Kalam Technological University, Punnapra, Kerala, 688003, India
Research Scholar

Varghese Paul, Rajagiri School of Engineering & Technology, APJ Abdul Kalam Technological University, Kakkanad, Kerala, 682039, India
Professor

Published

05/13/2026

Issue

Vol. 50 No. 12 (2026): Online-only issue

Section

Online-only

License

Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.

All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.

Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.

How to Cite

Ensemble Feature Fusion of VGG16, ResNet50, and Vision Transformer for Pneumonia Detection in Chest X-ray Images. (2026). Informatica, 50(12). https://doi.org/10.31449/inf.v50i12.9647
