Multi-Objective Hierarchical Reinforcement Learning with Online Meta-Learning for Dynamic Pricing Strategy Optimization

Guangzeng Zhang

Abstract


In this study, we propose a Multi-Objective Hierarchical Reinforcement Learning (MOHRL) approach with online meta-learning for dynamic pricing strategy optimization. Our method uses hierarchical RL layers to decompose the pricing decision-making process and a meta-learning adapter to accelerate adaptation during cold starts. We compare MOHRL with baseline methods such as DQN and NSGA-II. Experimental results show that MOHRL outperforms DQN by 25% in profit and 18% in retention rate, and NSGA-II by 30% in market share over a 30-day simulation. A simulation system built on data from more than 100,000 SKUs of an e-commerce platform demonstrates MOHRL's advantage in real-time dynamic pricing, especially in cold-start scenarios. Ablation experiments confirm the contribution of the meta-learning adapter.
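
To make the two ideas in the abstract concrete, the sketch below illustrates one plausible reading of the approach: a high-level policy that selects a weighting over the three objectives (profit, retention, market share), a low-level policy that selects a price level under that weighting, and a Reptile-style meta-update that provides a warm initialization for new SKUs (the cold-start case). This is a minimal toy illustration, not the paper's implementation; the environment, the weight profiles, and all names such as `ToyPricingEnv`, `LinearPolicy`, and `meta_adapt` are hypothetical.

```python
# Minimal sketch of a two-level, multi-objective pricing learner with a
# Reptile-style meta-update for cold starts. Illustrative only; the real
# MOHRL architecture in the paper may differ substantially.
import numpy as np

rng = np.random.default_rng(0)

WEIGHT_PROFILES = np.array([
    [0.6, 0.2, 0.2],   # profit-heavy
    [0.2, 0.6, 0.2],   # retention-heavy
    [0.2, 0.2, 0.6],   # market-share-heavy
])

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

class ToyPricingEnv:
    """Toy single-SKU environment: the action is a discrete price level."""
    PRICES = np.array([0.8, 0.9, 1.0, 1.1, 1.2])  # relative to a base price

    def step(self, price_idx):
        price = self.PRICES[price_idx]
        demand = max(0.0, 2.0 - 1.5 * price + rng.normal(0, 0.05))
        profit = (price - 0.7) * demand                         # margin over unit cost 0.7
        retention = 1.0 / (1.0 + np.exp(5 * (price - 1.05)))    # lower price -> higher retention
        share = demand / 2.0                                    # crude market-share proxy
        return np.array([profit, retention, share])

class LinearPolicy:
    """Softmax policy over discrete actions with one learnable score per action."""
    def __init__(self, n_actions, lr=0.1):
        self.theta = np.zeros(n_actions)
        self.lr = lr

    def act(self):
        p = softmax(self.theta)
        return rng.choice(len(p), p=p), p

    def update(self, action, probs, advantage):
        # REINFORCE gradient for a softmax policy with constant features
        grad = -probs
        grad[action] += 1.0
        self.theta += self.lr * advantage * grad

def run_episode(env, high, low, steps=50):
    """High level picks an objective weighting, low level picks a price."""
    baseline, total = 0.0, 0.0
    for _ in range(steps):
        w_idx, w_probs = high.act()
        a_idx, a_probs = low.act()
        objectives = env.step(a_idx)
        reward = float(WEIGHT_PROFILES[w_idx] @ objectives)  # scalarized reward
        adv = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        low.update(a_idx, a_probs, adv)
        high.update(w_idx, w_probs, adv)
        total += reward
    return total / steps

def meta_adapt(meta_theta, tasks, inner_steps=3, meta_lr=0.5):
    """Reptile-style outer loop: move meta-parameters toward task-adapted ones."""
    for env in tasks:
        low = LinearPolicy(len(ToyPricingEnv.PRICES))
        low.theta = meta_theta.copy()
        high = LinearPolicy(len(WEIGHT_PROFILES))
        for _ in range(inner_steps):
            run_episode(env, high, low)
        meta_theta += meta_lr * (low.theta - meta_theta)
    return meta_theta

if __name__ == "__main__":
    meta_theta = meta_adapt(np.zeros(len(ToyPricingEnv.PRICES)),
                            [ToyPricingEnv() for _ in range(5)])
    # A new SKU ("cold start") starts from the meta-initialization instead of zeros.
    high = LinearPolicy(len(WEIGHT_PROFILES))
    low = LinearPolicy(len(ToyPricingEnv.PRICES))
    low.theta = meta_theta.copy()
    print("avg scalarized reward on new SKU:", run_episode(ToyPricingEnv(), high, low))
```

The hierarchical split (objective weighting above, price selection below) and the meta-initialization for unseen SKUs are the two mechanisms the abstract credits for the method's performance, particularly in cold-start scenarios.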




DOI: https://doi.org/10.31449/inf.v49i26.8595

This work is licensed under a Creative Commons Attribution 3.0 License.