Fusion CNN-Transformer Model for Target Counting in Complex Scenarios
Abstract
To overcome the shortcomings of traditional manual counting methods, which are labor-intensive, resource-consuming, and inefficient, this study introduces a computer-based counting model that integrates convolutional neural networks (CNNs) with Transformer networks to efficiently recognize and count specific target objects in large-scale data scenarios. The approach leverages CNNs for local feature extraction and incorporates Transformer networks to capture long-range global information, achieving a synergistic effect; the methodology follows the principle of "CNN for feature extraction, Transformer for global attention." Experimental results show that the model achieves a mean absolute error of 10.13, a root mean square error of 12.08, an average counting accuracy of 98.6%, a peak signal-to-noise ratio of 23.75, a structural similarity of 0.933, a coefficient of determination of 0.901, an average counting time of about 6.58 ms per image, and a parameter count of 3.21 in target counting. The model also recognizes and responds well to highly complex scenes while maintaining high accuracy. Compared with a plain CNN model, the proposed model reduces the error rate by 13.4%, indicating that fusing CNN and Transformer networks is effective for object counting in computer vision tasks, and that a model integrating convolutional neural networks with full self-attention networks can be applied well to computer recognition and object counting.

DOI: https://doi.org/10.31449/inf.v49i12.7315
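The counting-error metrics reported in the abstract, mean absolute error (MAE) and root mean square error (RMSE), are standard and can be computed over per-image counts as in the sketch below. This is a generic illustration, not the paper's code, and the sample count values are invented for the example.

```python
import math

def counting_errors(true_counts, pred_counts):
    """Compute MAE and RMSE between ground-truth and predicted object counts."""
    n = len(true_counts)
    diffs = [p - t for t, p in zip(true_counts, pred_counts)]
    mae = sum(abs(d) for d in diffs) / n          # mean absolute error
    rmse = math.sqrt(sum(d * d for d in diffs) / n)  # root mean square error
    return mae, rmse

# Hypothetical per-image counts, for illustration only.
truth = [120, 95, 210, 160]
preds = [118, 99, 205, 168]
mae, rmse = counting_errors(truth, preds)
print(round(mae, 2), round(rmse, 2))  # → 4.75 5.22
```

Lower values on both metrics indicate better counting performance; RMSE penalizes large per-image miscounts more heavily than MAE.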
License
Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.
All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.
Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.