Hierarchical Local-Global Attention in a Multi-Scale Transformer Network for Enhanced Image Denoising
Abstract
Image denoising aims to remove noise from contaminated images. As real-world noise grows more complex, current denoising methods struggle to remove it effectively. This paper proposes a Multi-Scale Transformer Network (MST-Net) for image denoising. First, we introduce a novel multi-scale patch embedding strategy in which noisy images are divided into patches of varying scales to capture multi-scale features. Second, we propose a Hierarchical Local-Global Attention (HLGA) mechanism in MST-Net. HLGA first computes local attention within each scale and then integrates it with global attention to generate the final attention map. Consequently, MST-Net can capture long-range dependencies at multiple scales, effectively reducing complex noise. Additionally, we introduce a cross-scale feature fusion module to enhance information integration across scales. Extensive experiments on standard benchmarks, including the Set12, BSD68, CBSD68, and Urban100 datasets, demonstrate that the proposed MST-Net achieves state-of-the-art performance. Specifically, MST-Net outperforms existing methods by up to 0.17 dB in PSNR on Set12 and 0.15 dB on BSD68 at a high noise level (σ = 75). On color image datasets, MST-Net shows consistent improvements, with a PSNR gain of up to 0.13 dB on Urban100. These results highlight the effectiveness of MST-Net in handling diverse noise patterns while balancing computational efficiency and denoising performance. The proposed approach offers a practical solution for real-world image denoising applications.
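To put the reported PSNR gains in context, the following is a minimal sketch of how PSNR is commonly computed for 8-bit grayscale images corrupted by additive Gaussian noise of standard deviation σ. It is not taken from the paper: the function names, the flat gray test image, the peak value of 255, and the choice not to clip noisy pixels are all assumptions made for illustration.

```python
import math
import random


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images.

    Images are lists of rows of pixel intensities on a 0..peak scale.
    """
    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for r, e in zip(ref_row, est_row):
            squared_error += (r - e) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma (0-255 scale).

    Noisy values are not clipped to [0, 255] in this sketch (an assumption).
    """
    return [[pixel + rng.gauss(0.0, sigma) for pixel in row] for row in image]


def mse_ratio_for_gain(gain_db):
    """MSE ratio (improved / baseline) implied by a PSNR gain of gain_db."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 64x64 flat gray image; real benchmarks use natural images.
    clean = [[128.0] * 64 for _ in range(64)]
    noisy = add_gaussian_noise(clean, sigma=75, rng=rng)
    print(f"PSNR of noisy input: {psnr(clean, noisy):.2f} dB")
    print(f"MSE ratio for a 0.17 dB gain: {mse_ratio_for_gain(0.17):.4f}")
```

Because PSNR is logarithmic in the mean squared error, a 0.17 dB gain corresponds to roughly 4% lower MSE than the baseline (a ratio of about 0.96).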
DOI: https://doi.org/10.31449/inf.v49i6.6861
License
Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.
All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.
Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.







