Early Warning of Financial Crises in Manufacturing Using SMOTE-Tomek Random Forest and Sentiment-Enhanced Indicators
Abstract
Traditional financial crisis warning models for the manufacturing industry rely on a narrow set of indicators and handle imbalanced data poorly. To improve warning accuracy, this study constructs a warning system that integrates traditional financial indicators with text big data indicators, and adopts a random forest model combined with the synthetic minority oversampling technique and Tomek links (SMOTE-Tomek-RF). Using 21 manufacturing enterprises listed on the Shanghai Stock Exchange A-share market as samples and starting from 22 warning indicators, core variables are selected through random forest (RF) feature selection, and the warning performance of RF, SMOTE-RF, a single decision tree (DT), and the proposed SMOTE-Tomek-RF model is compared. The importance scores of the emotional inclination and popularity indicators were 0.052 and 0.047, respectively. Both scores exceeded the selection threshold and ranked high, indicating that these indicators effectively supplement the traditional financial indicators. The proposed model achieved an area under the receiver operating characteristic curve (AUC) of 0.968, an F1 score of 84.97%, and a G-Mean of 90.11%. By comparison, the AUCs of the traditional RF, SMOTE-RF, and DT models were 0.934, 0.953, and 0.943, respectively. In addition, after text big data was incorporated, the prediction accuracy for healthy and crisis firms reached 100% and 92.86%, respectively. In summary, the proposed model deals effectively with the data imbalance problem and improves the precision of early warning. It offers a reliable approach to financial crisis early warning in the manufacturing industry, which is of practical value for enterprise risk control and investor decision-making.
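The resampling idea behind SMOTE-Tomek can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `smote` and `remove_tomek_links`, the neighbour count `k`, the fixed seed, and the toy data are all assumptions made for illustration, and the random forest classification step is omitted. SMOTE first creates synthetic minority (crisis) samples by interpolating between a minority point and one of its nearest minority neighbours. Tomek link cleaning then removes pairs of mutual nearest neighbours that carry different labels, which tends to clear noisy points near the class boundary.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points by interpolating between neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        # Rank the other minority points by distance and keep the k closest.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(minority[i], minority[j]))
        j = rng.choice(others[:k])
        gap = rng.random()  # position along the segment between the two points
        synthetic.append(tuple(a + gap * (b - a)
                               for a, b in zip(minority[i], minority[j])))
    return synthetic


def remove_tomek_links(X, y):
    """Drop both members of every Tomek link.

    A Tomek link is a pair of points that are each other's nearest
    neighbour and belong to different classes.
    """
    n = len(X)
    nearest = [min((j for j in range(n) if j != i),
                   key=lambda j: math.dist(X[i], X[j]))
               for i in range(n)]
    drop = {i for i in range(n)
            if nearest[nearest[i]] == i and y[i] != y[nearest[i]]}
    keep = [i for i in range(n) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical toy data: label 0 = healthy (majority), label 1 = crisis (minority).
X = [(0, 0), (0, 1), (1, 0), (1, 1), (3, 3), (3.1, 3.0), (5, 5), (5, 6)]
y = [0, 0, 0, 0, 0, 1, 1, 1]

# Step 1: oversample the minority class until both classes have 5 samples.
minority = [p for p, label in zip(X, y) if label == 1]
new_points = smote(minority, n_new=2)
X_bal, y_bal = X + new_points, y + [1] * len(new_points)

# Step 2: clean the balanced set by removing Tomek links.
X_clean, y_clean = remove_tomek_links(X_bal, y_bal)
```

In this sketch both members of a Tomek link are removed; some variants remove only the majority-class member. The cleaned set `X_clean`, `y_clean` would then be passed to a classifier such as a random forest.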
DOI: https://doi.org/10.31449/inf.v49i32.11034
This work is licensed under a Creative Commons Attribution 3.0 License.








