MAGT: Multi-scale Attention Graph Transformer with Local Context Enhancement for CT Image Analysis

Youchun Qiu

Abstract


Computed tomography (CT) has become an important tool in cancer screening and diagnosis, where accurate image analysis can assist in early detection and treatment planning. While deep learning methods have shown progress in CT image analysis, effectively capturing both local features and global context remains challenging. This paper presents MAGT, a Multi-scale Attention Graph Transformer framework that combines graph-based geometric modeling with transformer architectures for CT image analysis. The MAGT framework includes two main components: a Multi-Head Feature Aggregator (MHFA) that integrates features from different scales while preserving their characteristics, and a Local Context Enhancement Block (LCEB) that strengthens the capture of spatial information. This design enables MAGT to process CT images by considering both lesion characteristics and their surrounding anatomical context, similar to clinical examination procedures. The framework uses graph structures to represent spatial relationships in CT images while incorporating transformer mechanisms to model feature dependencies. Experiments conducted on four public datasets (LIDC-IDRI, LUNGx, LUNA16, and DeepLesion) demonstrate the effectiveness of MAGT; for example, on the LIDC-IDRI dataset, MAGT achieved an accuracy of 91.5% and an F1-score of 91.3%, outperforming a strong baseline (Swin-T) by 2.1% in both metrics. Ablation studies verify the contributions of different components within the framework. The results indicate that MAGT offers a practical approach for CT image analysis, potentially supporting cancer detection and diagnosis in clinical applications.
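The abstract describes the MHFA as an attention-based fusion of features extracted at different scales. The paper's actual implementation is not given here, but the general idea can be sketched as follows: each scale contributes a feature vector, attention scores are computed against a shared query, and the fused representation is an attention-weighted combination of the per-scale features. All function names, projection matrices, and dimensions below are illustrative assumptions, not the authors' code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_scale_aggregate(features, w_query, w_keys):
    """Attention-weighted fusion of per-scale feature vectors (illustrative).

    features: list of (d,) arrays, one per scale
    w_query:  (d, d) projection producing the query from the mean feature
    w_keys:   (d, d) projection applied to each scale's features
    Returns a single (d,) fused feature vector.
    """
    f = np.stack(features)                        # (num_scales, d)
    query = f.mean(axis=0) @ w_query              # shared query, (d,)
    keys = f @ w_keys                             # one key per scale
    scores = keys @ query / np.sqrt(f.shape[1])   # scaled dot-product scores
    weights = softmax(scores)                     # one attention weight per scale
    return weights @ f                            # convex combination of scales

rng = np.random.default_rng(0)
d = 8
feats = [rng.normal(size=d) for _ in range(3)]    # three scales, e.g. fine/mid/coarse
fused = multi_scale_aggregate(feats, np.eye(d), np.eye(d))
print(fused.shape)  # (8,)
```

Because the weights sum to one, the fused vector stays within the convex hull of the per-scale features, which is one plausible way to "preserve their characteristics" while still letting the attention emphasize the most informative scale.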



References


F. Biemar, M. Foti. Global progress against cancer-challenges and opportunities. Cancer biology & medicine, 10(4): 183, 2013.

R. A. Smith, A. C. von Eschenbach, R. Wender, et al. American Cancer Society guidelines for the early detection of cancer: update of early detection guidelines for prostate, colorectal, and endometrial cancers: Also: update 2001-testing for early lung cancer detection. CA: a cancer journal for clinicians, 51(1): 38-75, 2001.

C. I. Lee, A. H. Haims, E. P. Monico, et al. Diagnostic CT scans: assessment of patient, physician, and radiologist awareness of radiation dose and possible risks. Radiology, 231(2): 393-398, 2004.

D. S. Gierada, W. C. Black, C. Chiles, et al. Low-dose CT screening for lung cancer: evidence from 2 decades of study. Radiology: Imaging Cancer, 2(2): e190058, 2020.

S. H. Yoon, J. M. Goo, S. M. Lee, et al. Positron emission tomography/magnetic resonance imaging evaluation of lung cancer: current status and future prospects. Journal of thoracic imaging, 29(1): 4-16, 2014.

H. Cao, H. Liu, E. Song, et al. A two-stage convolutional neural networks for lung nodule detection. IEEE journal of biomedical and health informatics, 24(7): 2006-2015, 2020.

Sakshiwala, M. P. Singh. A new framework for multi-scale CNN-based malignancy classification of pulmonary lung nodules. Journal of Ambient Intelligence and Humanized Computing, 14(5): 4675-4683, 2023.

Z. Niu, G. Zhong, H. Yu. A review on the attention mechanism of deep learning. Neurocomputing, 452: 48-62, 2021.

N. Zeng, P. Wu, Z. Wang, et al. A small-sized object detection oriented multi-scale feature fusion approach with application to defect detection. IEEE Transactions on Instrumentation and Measurement, 71: 1-14, 2022.

B. R. Mitchell. Overview of advanced neural network architectures. Artificial Intelligence in Pathology. Elsevier, 43-58, 2025.

L. Lan, P. Cai, L. Jiang, et al. BRAU-Net++: U-shaped hybrid CNN-transformer network for medical image segmentation. arXiv preprint arXiv:2401.00722, 2024.

R. Liu, H. Deng, Y. Huang, et al. Fuseformer: Fusing fine-grained information in transformers for video inpainting. Proceedings of the IEEE/CVF international conference on computer vision, 14040-14049, 2021.

J. Ma, Y. Bai, B. Zhong, et al. Visualizing and understanding patch interactions in vision transformer. IEEE Transactions on Neural Networks and Learning Systems, 35(10): 13671-13680, 2023.

H. Xu, Y. Wu. G2ViT: Graph Neural Network-Guided Vision Transformer Enhanced Network for retinal vessel and coronary angiograph segmentation. Neural Networks, 176: 106356, 2024.

H. Fan, B. Xiong, K. Mangalam, et al. Multiscale vision transformers. Proceedings of the IEEE/CVF international conference on computer vision, 6824-6835, 2021.

J. Wensel, H. Ullah, A. Munir. Vit-ret: Vision and recurrent transformer neural networks for human activity recognition in videos. IEEE Access, 11: 72227-72249, 2023.

M. Aiello, C. Cavaliere, A. D’Albore, et al. The challenges of diagnostic imaging in the era of big data. Journal of clinical medicine, 8(3): 316, 2019.

J. Song, Z. Liu, W. Zhong, et al. Non-small cell lung cancer: quantitative phenotypic analysis of CT images as a potential marker of prognosis. Scientific reports, 6(1): 38282, 2016.

G. Wu, A. Jochems, T. Refaee, et al. Structural and functional radiomics for lung cancer. European Journal of Nuclear Medicine and Molecular Imaging, 48: 3961-3974, 2021.

J. D. Shur, S. J. Doran, S. Kumar, et al. Radiomics in oncology: a practical guide. Radiographics, 41(6): 1717-1732, 2021.

M. Nasir, M. Attique Khan, M. Sharif, et al. An improved strategy for skin lesion detection and classification using uniform segmentation and feature selection-based approach. Microscopy research and technique, 81(6): 528-543, 2018.

Y. Fu, Y. Lei, T. Wang, et al. A review of deep learning-based methods for medical image multi-organ segmentation. Physica Medica, 85: 107-122, 2021.

J. Venugopalan, L. Tong, H. R. Hassanzadeh, et al. Multimodal deep learning models for early detection of Alzheimer’s disease stage. Scientific reports, 11(1): 3254, 2021.

P. Dhiman, J. Ma, C. L. Andaur Navarro, et al. Methodological conduct of prognostic prediction models developed using machine learning in oncology: a systematic review. BMC medical research methodology, 22(1): 101, 2022.

Z. Cao, R. Li, X. Yang, et al. Multi-scale detection of pulmonary nodules by integrating attention mechanism. Scientific Reports, 13(1): 5517, 2023.

M. Yang, H. Yu, H. Feng, et al. Enhancing the differential diagnosis of small pulmonary nodules: a comprehensive model integrating plasma methylation, protein biomarkers, and LDCT imaging features. Journal of Translational Medicine, 22(1): 984, 2024.

L. Qiu, L. Zhao, R. Hou, et al. Hierarchical multimodal fusion framework based on noisy label learning and attention mechanism for cancer classification with pathology and genomic features. Computerized Medical Imaging and Graphics, 104: 102176, 2023.

C. W. Wang, Y. C. Lee, C. C. Chang, et al. A weakly supervised deep learning method for guiding ovarian cancer treatment and identifying an effective biomarker. Cancers, 14(7): 1651, 2022.

Z. Li, Y. Jiang, M. Lu, et al. Survival prediction via hierarchical multimodal co-attention transformer: A computational histology-radiology solution. IEEE Transactions on Medical Imaging, 42(9): 2678-2689, 2023.

M. Song, S. Li, H. Wang, et al. MRI radiomics independent of clinical baseline characteristics and neoadjuvant treatment modalities predicts response to neoadjuvant therapy in rectal cancer. British Journal of Cancer, 127(2): 249-257, 2022.

Y. Su, D. Li, X. Chen. Lung nodule detection based on faster R-CNN framework. Computer Methods and Programs in Biomedicine, 200: 105866, 2021.

Y. Chen, C. Zheng, F. Hu, et al. Efficient two-step liver and tumour segmentation on abdominal CT via deep learning and a conditional random field. Computers in Biology and Medicine, 150: 106076, 2022.

X. Xie, X. Pan, F. Shao, et al. Mci-net: multi-scale context integrated network for liver ct image segmentation. Computers and Electrical Engineering, 101: 108085, 2022.

Y. George, B. J. Antony, H. Ishikawa, et al. Attention-guided 3D-CNN framework for glaucoma detection and structural-functional association using volumetric images. IEEE journal of biomedical and health informatics, 24(12): 3421-3430, 2020.

S. Baul, K. T. Ahmed, J. Filipek, et al. omicsGAT: Graph attention network for cancer subtype analyses. International Journal of Molecular Sciences, 23(18): 10220, 2022.

H. Wang, G. Huang, Z. Zhao, et al. Ccf-gnn: A unified model aggregating appearance, microenvironment, and topology for pathology image classification. IEEE Transactions on Medical Imaging, 42(11): 3179-3193, 2023.

J. Lian, J. Liu, S. Zhang, et al. A Structure-Aware Relation Network for Thoracic Diseases Detection and Segmentation. IEEE Transactions on Medical Imaging, 40(8): 2042-2052, 2021.

S. Ding, J. Li, J. Wang, et al. Multi-scale efficient graph-transformer for whole slide image classification. IEEE Journal of Biomedical and Health Informatics, 27(12): 5926-5936, 2023.

M. Liu, Y. Liu, P. Xu, et al. Exploiting Geometric Features via Hierarchical Graph Pyramid Transformer for Cancer Diagnosis using Histopathological Images. IEEE Transactions on Medical Imaging, 43(8): 2888-2900, 2024.

S. G. Armato III, G. McLennan, L. Bidaut, et al. The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A completed reference database of lung nodules on CT scans. Medical Physics, 38(2): 915-931, 2011.

S. G. Armato III, K. Drukker, F. Li, et al. LUNGx Challenge for computerized lung nodule classification. Journal of Medical Imaging, 3(4): 044506, 2016.

A. A. A. Setio, A. Traverso, T. De Bel, et al. Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: the LUNA16 challenge. Medical image analysis, 42: 1-13, 2017.

K. Yan, X. Wang, L. Lu, et al. DeepLesion: automated mining of large-scale lesion annotations and universal lesion detection with deep learning. Journal of Medical Imaging, 5(3): 036501, 2018.

D. Wu, Y. Ying, M. Zhou, et al. Improved ResNet-50 deep learning algorithm for identifying chicken gender. Computers and Electronics in Agriculture, 205: 107622, 2023.

A. Dosovitskiy, L. Beyer, A. Kolesnikov, et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.

Y. Dai, Y. Gao, F. Liu. Transmed: Transformers advance multi-modal medical image classification. Diagnostics, 11(8): 1384, 2021.

A. Hatamizadeh, V. Nath, Y. Tang, et al. Swin unetr: Swin transformers for semantic segmentation of brain tumors in mri images. International MICCAI brainlesion workshop. Cham: Springer International Publishing, 272-284, 2021.




DOI: https://doi.org/10.31449/inf.v49i31.8111

This work is licensed under a Creative Commons Attribution 3.0 License.