A hybrid approach of metaheuristic algorithms and Group Method of Data Handling (GMDH) to Predict Source Code Testability

Abstract

Testability is a critical software quality attribute that enables developers to efficiently test and improve code throughout the development lifecycle. This study investigates the prediction of software testability using machine learning algorithms, focusing on hybrid models that integrate the Group Method of Data Handling (GMDH) with metaheuristic optimization techniques. A dataset of 16,165 software classes described by 17 static source-code metrics was used, split into 80% training and 20% testing samples. GMDH parameters were optimized using three metaheuristic algorithms, Colliding Bodies Optimization (CBO), Firefly Algorithm (FA), and Ant Colony Optimization (ACO), yielding three hybrid predictive models. The baseline regressors used for comparison were SVM, Random Forest (RF), AdaBoost, MLP, and GLM. All models were evaluated using eight metrics: RMSE, MAE, R, RAE, Std, VAF, PI, and Max Error. Among the baselines, GLM achieved RMSE = 0.2357, MAE = 0.1871, and R = 0.5505, while SVM, RF, AdaBoost, and MLP produced slightly higher errors. The GMDH hybrids improved on the baselines: GMDH–FA achieved RMSE = 0.209 and R = 0.671, GMDH–ACO achieved RMSE = 0.209 and R = 0.673, and GMDH–CBO achieved the best overall performance with RMSE = 0.208 and R = 0.674. These results indicate that metaheuristic-optimized GMDH models, particularly GMDH–CBO, provide a robust and accurate approach for predicting software testability.
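
To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch of how several of them are commonly computed from true and predicted testability values. It is not the author's implementation: the function name `regression_metrics` and the sample values are hypothetical, the formulas for RAE and VAF are the usual textbook definitions and are assumed rather than taken from the paper, and Std and PI are omitted because the abstract does not define them.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return common regression error metrics for paired targets and predictions.

    Assumes both sequences have the same length (at least 2) and that neither
    is constant, so the correlation and RAE denominators are non-zero.
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    mean_e = sum(errors) / n

    # Root mean squared error, mean absolute error, largest absolute error.
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    max_error = max(abs(e) for e in errors)

    # Pearson correlation coefficient R between targets and predictions.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    ss_t = sum((t - mean_t) ** 2 for t in y_true)
    ss_p = sum((p - mean_p) ** 2 for p in y_pred)
    r = cov / math.sqrt(ss_t * ss_p)

    # Relative absolute error: total absolute error relative to a
    # mean-only predictor (assumed definition).
    rae = sum(abs(e) for e in errors) / sum(abs(t - mean_t) for t in y_true)

    # Variance accounted for, in percent (assumed definition).
    var_e = sum((e - mean_e) ** 2 for e in errors) / n
    var_t = ss_t / n
    vaf = (1.0 - var_e / var_t) * 100.0

    return {
        "RMSE": rmse,
        "MAE": mae,
        "R": r,
        "RAE": rae,
        "VAF": vaf,
        "MaxError": max_error,
    }


if __name__ == "__main__":
    # Hypothetical testability scores, for illustration only.
    scores = regression_metrics([0.2, 0.4, 0.6, 0.8], [0.1, 0.5, 0.6, 0.9])
    for name, value in scores.items():
        print(f"{name}: {value:.4f}")
```

Under these definitions, lower RMSE, MAE, RAE, and Max Error indicate a better fit, and higher R and VAF indicate a better fit, which matches the direction of the comparisons reported above.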

Authors

  • Xiaobo Wu, School of Business, Lingnan Normal University, Zhanjiang 524048, Guangdong, China

DOI:

https://doi.org/10.31449/inf.v50i10.10680

Published

03/18/2026

How to Cite

Wu, X. (2026). A hybrid approach of metaheuristic algorithms and Group Method of Data Handling (GMDH) to Predict Source Code Testability. Informatica, 50(10). https://doi.org/10.31449/inf.v50i10.10680