Deep Reinforcement Learning with Convolutional Neural Networks for Optimizing Supply Chain Inventory Management
Abstract
With the increasing complexity of global supply chains and frequent fluctuations in market demand, inventory management faces severe challenges, and traditional inventory control methods struggle to meet them. This paper constructs an inventory control model that combines deep reinforcement learning (DRL) with convolutional neural networks (CNN). After defining the state space, action space, and reward function, the Q-learning algorithm is used to optimize inventory decisions, while the CNN extracts features from historical demand data to improve the accuracy of demand forecasting. The dataset consists of historical sales data from a medium-sized clothing retailer, covering sales, inventory, and replenishment records for the past 4 years. The model was trained for 500 episodes and compared against three benchmark models: the economic order quantity (EOQ) model, the periodic ordering model, and the simple moving average forecasting model. The experimental results show a demand forecast error of only 3.2%, measured on independent actual test data, a total inventory cost of 14,500 yuan, a cost reduction rate of -22%, an average inventory turnover rate of 10.5 times, and an average out-of-stock rate of only 2.1%. All indicators are significantly better than those of the EOQ model and the periodic ordering model. The study shows that the model can effectively cope with demand fluctuations and uncertainty, optimize inventory management, and provide a new and effective method for supply chain inventory control.
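To make the Q-learning formulation in the abstract concrete, the following is a minimal sketch of tabular Q-learning on a toy single-item inventory problem. It is not the paper's model: the abstract does not specify the state space, action space, or reward function, and the paper pairs Q-learning with a CNN rather than a lookup table. Here the state is assumed to be the stock level, the action the order quantity, and the reward the negative of holding, stockout, and ordering costs. All cost values, the uniform demand distribution, and the function names `step`, `train`, and `greedy_policy` are hypothetical. Only the 500-episode training length is taken from the abstract.

```python
import random


def step(stock, order, demand, max_stock=10,
         hold_cost=1.0, stockout_cost=5.0, order_cost=2.0):
    """Simulate one period and return (next_stock, reward).

    All cost parameters are illustrative assumptions, not values from the paper.
    """
    available = min(stock + order, max_stock)  # warehouse capacity cap
    sold = min(available, demand)
    next_stock = available - sold
    shortage = demand - sold
    cost = hold_cost * next_stock + stockout_cost * shortage
    if order > 0:
        cost += order_cost  # fixed cost per replenishment
    return next_stock, -cost


def train(episodes=500, horizon=30, max_stock=10, max_order=5,
          alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    # q[stock][order] estimates the discounted return of ordering `order` at `stock`.
    q = [[0.0] * (max_order + 1) for _ in range(max_stock + 1)]
    for _ in range(episodes):
        stock = rng.randint(0, max_stock)
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.randint(0, max_order)  # explore
            else:
                action = max(range(max_order + 1), key=lambda a: q[stock][a])
            demand = rng.randint(0, 4)  # assumed demand: uniform on 0..4
            nxt, reward = step(stock, action, demand, max_stock)
            # Q-learning update: bootstrap from the best action in the next state.
            target = reward + gamma * max(q[nxt])
            q[stock][action] += alpha * (target - q[stock][action])
            stock = nxt
    return q


def greedy_policy(q):
    """Return the order quantity with the highest Q-value for each stock level."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    policy = greedy_policy(train())
    for level, order in enumerate(policy):
        print(f"stock={level:2d} -> order {order}")
```

In a deep variant, the table `q` would be replaced by a network that maps a window of historical demand (plus the current stock) to one Q-value per order quantity, which is where a CNN feature extractor of the kind the abstract describes would fit.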
DOI: https://doi.org/10.31449/inf.v49i26.8396

This work is licensed under a Creative Commons Attribution 3.0 License.