Enterprise Financial Fraud Detection using GA-BPNN, Random Forest, and SVM with Multi-Modal Features
Abstract
This paper selected 22 indicators related to financial fraud, covering both financial and non-financial indicators from a machine learning perspective, and screened them using the information value (IV) and the Spearman correlation coefficient. Three algorithms were then introduced: random forest (RF), support vector machine (SVM), and genetic algorithm-back-propagation neural network (GA-BPNN). Experiments were carried out on a sample of 784 fraudulent and 784 non-fraudulent enterprises sourced from the China Stock Market & Accounting Research Database. The results indicated that using both financial and non-financial indicators yielded better identification results than using financial indicators alone: the recall rate of the RF algorithm increased from 0.5762 to 0.6295, and its area under the receiver operating characteristic curve (AUC) increased from 0.6331 to 0.6874. The GA-BPNN algorithm performed best in recognizing financial fraud, achieving a recall rate of 0.8256 and an AUC of 0.8537, the highest among the three algorithms. The shareholding ratio of the largest shareholder was the most important factor in identifying financial fraud, followed by the inventory turnover rate, which warrants particular attention. The results demonstrate the usability of the GA-BPNN algorithm and the established indicator system, which can be applied in actual financial scenarios.
DOI: https://doi.org/10.31449/inf.v50i9.12142
License
Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.
All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.
Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.







