Bank Credit Default Risk Assessment Model Based on Federated Learning
Abstract
This paper presents a privacy-compliant credit default risk model that combines Vertical Federated Learning (VFL), XGBoost, Knowledge Distillation (KD), and Temperature Scaling (TS). It addresses the challenge of preserving privacy while improving model performance in federated learning environments, with a focus on bank credit risk prediction. The model is evaluated on the CRMS-2024 dataset. Compared with local models and existing federated learning methods, it improves on multiple metrics: it achieves an AUC of 0.804, which is 0.063 higher than the local model, and reduces the expected calibration error (ECE) to 0.024. The model also performs well on fairness: with temperature scaling, the demographic parity difference (DPD) is 0.021 and the equal opportunity gap is 0.028. Statistical significance is assessed with the DeLong test for AUC and the BCa bootstrap for Brier scores, with Holm correction for multiple comparisons. The method remains robust to noise, data imbalance, and Byzantine attacks, which indicates that KD and TS improve performance and calibration in non-IID federated settings. The paper also outlines future work, including cross-domain validation on datasets such as LendingClub and integration with regulatory APIs to support compliance.
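The abstract reports calibration and fairness figures without defining the underlying quantities. As a rough illustration only, and not the authors' implementation, the sketch below shows one common way to fit a temperature on held-out binary logits, compute an equal-width-bin ECE, and measure the demographic parity difference between two groups. The function names, the 10-bin choice, the grid search over temperatures, and the toy data are all assumptions made for this example.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def nll(logits, labels, T):
    """Mean negative log-likelihood of binary labels under logits divided by T."""
    eps = 1e-12
    total = 0.0
    for z, y in zip(logits, labels):
        p = min(max(sigmoid(z / T), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

def fit_temperature(logits, labels, grid=None):
    """Choose T by grid search on validation NLL (assumed here for simplicity;
    gradient-based optimisers are the more usual choice)."""
    grid = grid or [0.05 * k for k in range(2, 201)]  # 0.1 to 10.0
    return min(grid, key=lambda T: nll(logits, labels, T))

def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted mean |observed positive rate - mean predicted probability| per bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    n = len(probs)
    ece = 0.0
    for b in bins:
        if b:
            conf = sum(p for p, _ in b) / len(b)
            rate = sum(y for _, y in b) / len(b)
            ece += len(b) / n * abs(rate - conf)
    return ece

def demographic_parity_difference(preds, groups):
    """Absolute gap in positive-prediction rate between groups 0 and 1."""
    rates = []
    for g in (0, 1):
        sel = [p for p, gg in zip(preds, groups) if gg == g]
        rates.append(sum(sel) / len(sel))
    return abs(rates[0] - rates[1])

# Toy, made-up validation data: confident logits with two mistakes.
toy_logits = [4.0, 3.5, -4.0, -3.0, 3.0, -3.5, 4.0, -4.0]
toy_labels = [1, 0, 0, 1, 1, 0, 1, 0]
T_hat = fit_temperature(toy_logits, toy_labels)
calibrated = [sigmoid(z / T_hat) for z in toy_logits]
print("fitted temperature:", T_hat)
print("ECE after scaling:", expected_calibration_error(calibrated, toy_labels))
```

Because temperature scaling divides every logit by the same positive constant, it leaves the ranking of applicants unchanged, so AUC is unaffected and only the probability estimates move.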
DOI: https://doi.org/10.31449/inf.v50i6.12533
License
Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.
All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.
Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.







