SwinEff-DR: Hybrid Swin Transformer & Efficient Net Architecture for Multi-Scale Diabetic Retinopathy Detection
Abstract
Diabetic retinopathy (DR) remains one of the leading causes of preventable blindness worldwide, underscoring the need for early detection and accurate classification. Its root cause is diabetes mellitus. According to the WHO, about 537 million people live with diabetes, and this number is expected to rise to 783 million by 2047. DR is incurable; however, early detection can significantly slow the progression of vision loss, and routine ophthalmic examinations and continuous monitoring play a critical role in preventing DR-related blindness. Consequently, there is a pressing need for advanced computer-assisted diagnostic systems that can accurately detect and grade DR and support ophthalmologists in timely intervention. To address this, we propose SwinEff-DR, a hybrid architecture that combines a Swin Transformer with EfficientNet to achieve robust and precise DR classification. After advanced preprocessing of the EyePACS dataset, the images are passed to the Swin Transformer, which serves as the backbone of the model, followed by EfficientNet. SwinEff-DR attains 0.96 precision, 0.97 recall, 0.97 accuracy, and a 0.97 F1-score, a 1.49% improvement over existing methods. Furthermore, the framework aligns its predictions with standardised severity grading, enabling robust and clinically meaningful diagnostic support.
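The reported precision, recall, accuracy, and F1-score can be illustrated with a minimal sketch of how such figures are typically derived from a multi-class confusion matrix. The abstract does not state the averaging scheme or the number of classes, so macro-averaging over five severity grades (0 to 4) is an assumption here, and the function names `confusion_matrix` and `macro_metrics` are hypothetical, not taken from the paper.

```python
from typing import Dict, List


def confusion_matrix(y_true: List[int], y_pred: List[int], n_classes: int) -> List[List[int]]:
    """Count predictions; rows are true grades, columns are predicted grades."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def macro_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F1 (assumed averaging)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[r][k] for r in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for illustration only; these are not results from the paper.
toy_true = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
toy_pred = [0, 0, 1, 2, 2, 2, 3, 3, 4, 3]
toy_scores = macro_metrics(confusion_matrix(toy_true, toy_pred, n_classes=5))
```

Macro-averaging weights every grade equally, which matters on imbalanced fundus datasets where the healthy class dominates; a weighted or micro average would give different numbers for the same predictions.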
DOI: https://doi.org/10.31449/inf.v49i31.12008
This work is licensed under a Creative Commons Attribution 3.0 License.








