Automated Financial Statement Auditing via YOLOv5s Object Detection and NLP-Based Semantic Analysis
Abstract
Driven by globalization and digitalization, financial statements have grown sharply in volume and complexity, and the efficiency and accuracy limits of traditional auditing methods have become increasingly apparent. Few studies have combined object detection with text analysis for financial auditing; this paper explores that combination and proposes an intelligent financial statement audit system. The system integrates YOLOv5s-based financial image recognition with natural language processing algorithms to recognize and interpret financial information quickly and accurately. For the vision component, YOLOv5s is optimized on a domain-specific dataset of 15,000 annotated financial statement images and achieves 96.4% detection accuracy in parsing complex tabular structures. For text understanding, we implement a hybrid NLP architecture that uses BERT for semantic role labeling and a BiLSTM with attention mechanisms to extract financial indicators and risk factors, trained on a corpus of 50,000 financial reports with an 85/15 train/test split. Experimental results show that the intelligent audit system reaches a recognition accuracy of 98% when processing large-scale financial statement data, 15% higher than traditional methods, and runs 3 times faster, significantly shortening the audit cycle and reducing audit cost. The system also automatically detects abnormal data, helping auditors quickly locate potential financial risks and supporting decision-making.
DOI: https://doi.org/10.31449/inf.v49i11.8999
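As a rough illustration of the 85/15 train/test split mentioned in the abstract, the sketch below partitions a corpus of 50,000 items into 42,500 training and 7,500 test examples. The function name, the fixed shuffle seed, and the integer report identifiers are assumptions made for this example; the abstract does not say how the authors performed the split (for instance, whether it was random or stratified).

```python
import random


def train_test_split(items, train_percent=85, seed=0):
    """Shuffle a copy of `items` and split it into (train, test) lists.

    `train_percent` is an integer percentage so that the cut index is
    computed with exact integer arithmetic.
    """
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 1 and 99")
    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_percent // 100
    return shuffled[:cut], shuffled[cut:]


# Hypothetical integer identifiers standing in for the 50,000 reports.
report_ids = range(50_000)
train_ids, test_ids = train_test_split(report_ids)
print(len(train_ids), len(test_ids))  # 42500 7500
```

Shuffling before cutting avoids any ordering bias in the source corpus, and the two resulting lists are disjoint by construction.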
License
Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.
All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.
Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.







