A Hierarchical Attention-Based Heterogeneous Multi-Agent PPO Framework for Distributed Warehouse Scheduling

Abstract

Dynamic changes in the warehouse environment, heterogeneous task allocation, and the complexity of multi-agent collaboration make it difficult for traditional scheduling algorithms to meet the demands of modern warehousing. This study proposes a heterogeneous multi-agent collaborative scheduling method based on an improved Proximal Policy Optimization (PPO) framework, which integrates a hierarchical attention-driven architecture and a dynamic variance constraint algorithm to address the spatio-temporal coupling constraints of distributed warehouse scheduling. We design a multi-objective reward function (covering task timeliness, energy consumption, and space utilization) and a dynamic computing resource allocation strategy to improve the system's efficiency and robustness when handling large-scale order volumes (500+ orders per day) and unexpected events (e.g., equipment failure). Experimental results show that, compared with traditional scheduling algorithms, the completion rate of collaborative agent tasks increases from 56.89% to 73.24%, the average task execution delay drops from 22.1 seconds to 15.67 seconds, and the storage space utilization rate increases from 49.2% to 63.5%. In complex order scenarios, the framework's sorting accuracy for multiple types of goods reaches 94.5%, which is 37.67 percentage points higher than the baseline model, and the share of system resources consumed by multi-agent communication overhead drops from 88.76% to 63.5%, which supports the algorithm's ability to optimize under resource constraints.
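
The abstract names three reward components (task timeliness, energy consumption, and space utilization) but does not give the functional form. As a rough illustration only, one common way to combine such objectives is a weighted sum of normalized terms. In the sketch below, the names `StepOutcome` and `multi_objective_reward`, the weights, and the normalization bounds are all hypothetical choices made for the example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class StepOutcome:
    """Hypothetical per-step measurements for one agent (assumed, not from the paper)."""
    delay_s: float            # task execution delay in seconds
    energy_j: float           # energy consumed during the step
    space_utilization: float  # fraction of storage space in use, expected in [0, 1]


def _clamp01(x: float) -> float:
    """Clamp a value to the closed interval [0, 1]."""
    return max(0.0, min(1.0, x))


def multi_objective_reward(
    outcome: StepOutcome,
    w_time: float = 0.5,          # assumed weight for timeliness
    w_energy: float = 0.2,        # assumed weight for the energy penalty
    w_space: float = 0.3,         # assumed weight for space utilization
    max_delay_s: float = 60.0,    # assumed normalization bound for delay
    max_energy_j: float = 1000.0, # assumed normalization bound for energy
) -> float:
    """Weighted-sum reward: reward timeliness and space use, penalize energy."""
    # Normalize each term to [0, 1] so the weights are comparable.
    timeliness = 1.0 - _clamp01(outcome.delay_s / max_delay_s)
    energy_cost = _clamp01(outcome.energy_j / max_energy_j)
    space = _clamp01(outcome.space_utilization)
    return w_time * timeliness - w_energy * energy_cost + w_space * space


if __name__ == "__main__":
    fast_and_frugal = StepOutcome(delay_s=15.67, energy_j=200.0, space_utilization=0.635)
    slow_and_costly = StepOutcome(delay_s=22.1, energy_j=400.0, space_utilization=0.492)
    print(multi_objective_reward(fast_and_frugal))
    print(multi_objective_reward(slow_and_costly))
```

A weighted sum is the simplest scalarization; the method described in the abstract may shape or constrain the reward differently (for example through its dynamic variance constraint), so this should be read as a starting point rather than a reconstruction.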

Authors

  • Yuzhang Huang, School of Business, Fuzhou Polytechnic Institute, Fuzhou, 350108, China

DOI:

https://doi.org/10.31449/inf.v50i11.12435

Published

04/23/2026

How to Cite

Huang, Y. (2026). A Hierarchical Attention-Based Heterogeneous Multi-Agent PPO Framework for Distributed Warehouse Scheduling. Informatica, 50(11). https://doi.org/10.31449/inf.v50i11.12435