Impact of Data Balancing During Training for Best Predictions
To protect the middle class from over-indebtedness, banking institutions need a flexible, analytics-based evaluation method that improves the banking process by detecting customers who are likely to have difficulty managing their debt. In this paper, we test and evaluate a wide variety of data balancing methods on selected machine learning algorithms (MLAs) to overcome the effects of imbalanced data and to show their impact on the training step of credit risk prediction. Our objective is to handle data imbalance so as to achieve the best predictions. We investigate how these methods perform across different learners when classification models are trained using MLAs.
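As an illustration of what a data balancing step can look like, the following minimal sketch applies random oversampling, one common balancing technique, to a toy training set. It is not taken from the paper: the function name `random_oversample`, the toy data, and the choice of oversampling are all assumptions made for the example, and the methods actually evaluated in the paper may differ.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the feature rows by their class label.
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)

    # Every class is grown to the size of the largest class.
    target = max(len(rows) for rows in by_class.values())

    out_x, out_y = [], []
    for y, rows in by_class.items():
        # Sample with replacement to fill the gap to the target size.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for x in rows + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Toy training data (assumed): 8 rows labelled 0 and 2 rows labelled 1,
# standing in for customers without and with repayment difficulty.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))  # class counts after balancing
```

After balancing, each class in the toy set has eight rows. In a setup like this, balancing would normally be applied to the training split only, so that the evaluation data keeps the original class distribution.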
This work is licensed under a Creative Commons Attribution 3.0 License.