RK-W-Stacking: A Hybrid Model Combining Entropy-Weighted RFM-K-Means Clustering and Weighted Stacking Ensemble for User Consumption Prediction
Abstract
Understanding user consumption characteristics helps predict consumer behavior and design appropriate marketing strategies, which in turn promotes spending and supports economic circulation. However, existing models recognize user consumption characteristics poorly and predict consumer behavior with limited accuracy. To address this, this study constructs a user consumption behavior prediction model, RK-W-Stacking. The method makes three main contributions. First, the K-means distance calculation is reweighted by feature variance to make the clustering of non-convex data sets more robust. Second, the RFM index weights are computed dynamically with the entropy weight method to improve user value assessment. Third, a Stacking ensemble framework weighted by error rate integrates the strengths of XGBoost and random forest classifiers. Experimental results show a feature recognition accuracy of 96.3%, a loss rate of 0.25%, and an AUC of 0.91, demonstrating strong classification ability. In long-term consumption prediction experiments, the proposed model reaches a maximum memory usage of 1170 MB and a maximum response time of 120 ms while maintaining an annual prediction accuracy above 94.2%. These results indicate that the proposed model is accurate and adaptable both in extracting consumption characteristics and in forecasting future consumer behavior. The system is also more stable and responds faster than comparable prediction models, giving merchants new insights for optimizing business strategies and anticipating market trends.
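As an illustration of the entropy weight step mentioned in the abstract, the following is a minimal sketch of the standard entropy weight method applied to RFM-style indicators. It is not the authors' implementation: the function name `entropy_weights`, the sample scores, and the assumption that every indicator is non-negative and already oriented so that larger means more valuable are all hypothetical choices made for this example.

```python
import math


def entropy_weights(rows):
    """Standard entropy weight method (illustrative sketch, not the paper's code).

    rows: list of per-user lists of non-negative indicator scores,
          assumed to be oriented so that larger means more valuable.
    Returns one weight per indicator; the weights sum to 1.
    """
    n = len(rows)
    if n < 2:
        raise ValueError("need at least two users to compute entropy")
    m = len(rows[0])

    divergences = []
    for j in range(m):
        column = [row[j] for row in rows]
        total = sum(column)
        if total <= 0:
            # An all-zero indicator carries no information.
            divergences.append(0.0)
            continue
        # Share of each user in indicator j; 0 * log(0) is treated as 0.
        shares = [v / total for v in column]
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        # Low entropy (uneven shares) means the indicator discriminates more.
        divergences.append(max(0.0, 1.0 - entropy))

    d_sum = sum(divergences)
    if d_sum == 0:
        # Every indicator is uninformative: fall back to equal weights.
        return [1.0 / m] * m
    return [d / d_sum for d in divergences]


# Hypothetical recency, frequency, and monetary scores for three users.
rfm_scores = [
    [5, 2, 100],
    [5, 8, 100],
    [5, 2, 900],
]
weights = entropy_weights(rfm_scores)
print(weights)
```

In this toy input the first indicator is identical for every user, so it receives a weight of essentially zero, while the most unevenly distributed indicator receives the largest weight. That is the sense in which the weights are "dynamic": they are recomputed from the data rather than fixed in advance.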
DOI: https://doi.org/10.31449/inf.v49i32.9086

This work is licensed under a Creative Commons Attribution 3.0 License.