Fusion CNN-Transformer Model for Target Counting in Complex Scenarios
Abstract
To overcome the shortcomings of traditional manual counting methods, which are labor-intensive, resource-consuming, and inefficient, this study introduces a computer-based counting model that integrates convolutional neural networks (CNNs) with Transformer networks to efficiently recognize and count specific target objects in large-scale data scenarios. The approach leverages CNNs for local feature extraction and incorporates Transformer networks to capture long-range global information, achieving a synergistic effect; the methodology follows the principle of "CNN for feature extraction, Transformer for global attention." Experimental results show that the model achieves a mean absolute error of 10.13, a root mean square error of 12.08, an average counting accuracy of 98.6%, a peak signal-to-noise ratio of 23.75, a structural similarity of 0.933, a coefficient of determination of 0.901, an average counting time of about 6.58 ms per image, and a parameter count of 3.21 in target counting. The model also recognizes and responds well to highly complex scenes while maintaining high accuracy. Compared with a plain CNN model, the proposed model reduces the error rate by 13.4%, indicating that fusing CNN and Transformer networks is effective for object counting in computer vision tasks, and that a model integrating convolutional neural networks with full self-attention networks can be applied well to computer recognition and object counting.

DOI: https://doi.org/10.31449/inf.v49i12.7315
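The counting-error metrics reported in the abstract, mean absolute error (MAE) and root mean square error (RMSE), are standard and can be computed over per-image counts as in the sketch below. This is a generic illustration, not the paper's code, and the sample count values are invented for the example.

```python
import math

def counting_errors(true_counts, pred_counts):
    """Compute MAE and RMSE between ground-truth and predicted object counts."""
    n = len(true_counts)
    diffs = [p - t for t, p in zip(true_counts, pred_counts)]
    mae = sum(abs(d) for d in diffs) / n          # mean absolute error
    rmse = math.sqrt(sum(d * d for d in diffs) / n)  # root mean square error
    return mae, rmse

# Hypothetical per-image counts, for illustration only.
truth = [120, 95, 210, 160]
preds = [118, 99, 205, 168]
mae, rmse = counting_errors(truth, preds)
print(round(mae, 2), round(rmse, 2))  # → 4.75 5.22
```

Lower values on both metrics indicate better counting performance; RMSE penalizes large per-image miscounts more heavily than MAE.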
License
Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.
All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.
Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.