Hierarchical Multi-Agent Deep Reinforcement Learning for Coordinated Optimization of Aggregated Virtual Power Plants in Smart Microgrids

Junxiong Zhang; Jiao Wang; Qihang Wang; Qi Zhou

doi:10.31449/inf.v50i11.12252

Abstract

The rapid expansion of distributed energy resources (DERs) has complicated the operation of distribution networks and raised problems of computational efficiency, data privacy, and equitable benefit assessment. To address these problems, this paper presents a two-tiered virtual power plant (VPP) coordination architecture that accounts for the operational interests of both Distribution System Operators (DSOs) and VPPs under AC optimal power flow (AC-OPF) constraints. The upper layer uses a penalty-function-enhanced OPF mechanism to guarantee network security in the event of voltage or branch-limit violations. The lower layer integrates an Asynchronous Advantage Actor-Critic (A3C) multi-agent architecture with a parameter-sharing Twin-Delayed Deep Deterministic Policy Gradient (PS-TD3) algorithm. Through lightweight parameter sharing and decentralized execution, each agent, which represents a VPP subsystem, learns energy-dispatch, storage, and flexibility decisions, which reduces computational cost and preserves data privacy. In simulation tests on the IEEE 33-node distribution network, the proposed dual-layer multi-agent reinforcement learning (MARL) approach outperforms three baselines: non-cooperative TD3, traditional distributed OPF, and independent Q-learning. Thanks to parameter sharing, the PS-TD3 + A3C hybrid improves convergence speed by 42% and reduces per-step computing time by 37%. It also reduces voltage variation by 31.4%, network real-power losses by 26.7%, and operating cost by 18.2%. Because agents share only compressed gradients and not raw operational data, privacy leakage is reduced by more than 80%. The findings indicate that the proposed framework provides a computationally efficient, scalable, and privacy-preserving approach to coordinated VPP operation in contemporary DER-rich distribution systems.
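
The abstract does not give the form of the upper-layer penalty function, so the following is only a minimal sketch of one common choice: a quadratic penalty on bus voltages that leave an allowed band, added to the operating cost. The function names, the 0.95 to 1.05 per-unit band, and the weight are illustrative assumptions, not values from the paper, and branch-limit terms are omitted.

```python
def voltage_penalty(voltages_pu, v_min=0.95, v_max=1.05, weight=100.0):
    """Quadratic penalty for per-unit bus voltages outside [v_min, v_max].

    The band and weight are assumed defaults for illustration only.
    """
    penalty = 0.0
    for v in voltages_pu:
        over = max(0.0, v - v_max)    # amount above the upper limit
        under = max(0.0, v_min - v)   # amount below the lower limit
        penalty += weight * (over ** 2 + under ** 2)
    return penalty


def penalized_cost(operating_cost, voltages_pu, **penalty_kwargs):
    """Operating cost plus the voltage-violation penalty."""
    return operating_cost + voltage_penalty(voltages_pu, **penalty_kwargs)
```

With such a term, a dispatch that keeps every bus inside the band pays only its operating cost, while any violation raises the objective in proportion to the squared excursion.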

Authors

Junxiong Zhang, Anhui Technical College of Industry and Economy, China
Jiao Wang, Anhui Jianzhu University, China
Qihang Wang, Anhui Technical College of Industry and Economy, China
Qi Zhou, Anhui Technical College of Industry and Economy, China

DOI:

https://doi.org/10.31449/inf.v50i11.12252

Published

04/23/2026

How to Cite

Zhang, J., Wang, J., Wang, Q., & Zhou, Q. (2026). Hierarchical Multi-Agent Deep Reinforcement Learning for Coordinated Optimization of Aggregated Virtual Power Plants in Smart Microgrids. Informatica, 50(11). https://doi.org/10.31449/inf.v50i11.12252

Issue

Vol. 50 No. 11 (2026): Online-only issue

Section

Online-only

License

Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.

All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.

Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.
