Hierarchical Multi-Agent Deep Reinforcement Learning for Coordinated Optimization of Aggregated Virtual Power Plants in Smart Microgrids
Abstract
The rapid expansion of distributed energy resources (DERs) has made distribution networks more complex to operate and has raised challenges in computational efficiency, data privacy, and equitable benefit assessment. To address these challenges, this paper presents a two-tiered virtual power plant (VPP) coordination architecture that accounts for the operational interests of both Distribution System Operators (DSOs) and VPPs under AC optimal power flow (AC-OPF) constraints. The upper layer uses a penalty-function-enhanced OPF mechanism to maintain network security by penalizing voltage and branch-limit violations, while the lower layer integrates an Asynchronous Advantage Actor-Critic (A3C) multi-agent architecture with a parameter-sharing Twin-Delayed Deep Deterministic Policy Gradient (PS-TD3) algorithm. Through lightweight parameter sharing and decentralized execution, every agent, each representing a VPP subsystem, learns optimal decisions for energy dispatch, storage, and flexibility, which substantially reduces computational cost and preserves data privacy. In simulation tests on the IEEE 33-node distribution network, the proposed dual-layer MARL approach outperforms three baselines: non-cooperative TD3, traditional distributed OPF, and independent Q-learning. Thanks to parameter sharing, the PS-TD3 + A3C hybrid improves convergence speed by 42% and reduces per-step computing time by 37%. It also reduces voltage variation by 31.4%, network real-power losses by 26.7%, and operating cost by 18.2%. Because agents share only compressed gradients and not raw operational data, privacy leakage is reduced by more than 80%.
These findings show that, in contemporary distribution systems rich in DERs, the proposed framework provides a computationally efficient, scalable, and privacy-preserving approach to coordinated VPP operation.
DOI: https://doi.org/10.31449/inf.v50i11.12252
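As a rough illustration of the penalty-function idea mentioned in the abstract, the sketch below adds a quadratic penalty to an operating cost whenever bus voltages or branch flows leave their limits. The function name `penalized_cost`, the quadratic form, the voltage band of 0.95 to 1.05 p.u., and the weight `rho` are all assumptions made for illustration; the abstract does not specify the paper's actual penalty formulation.

```python
from typing import Sequence


def penalized_cost(
    base_cost: float,
    voltages_pu: Sequence[float],
    branch_flows: Sequence[float],
    branch_limits: Sequence[float],
    v_min: float = 0.95,   # assumed lower voltage bound (p.u.)
    v_max: float = 1.05,   # assumed upper voltage bound (p.u.)
    rho: float = 1000.0,   # assumed penalty weight
) -> float:
    """Return base_cost plus a quadratic penalty on limit violations.

    Hypothetical sketch: violations inside the limits contribute zero,
    so a feasible operating point keeps its original cost.
    """
    # Squared distance of each bus voltage outside [v_min, v_max].
    v_violation = sum(
        max(0.0, v_min - v) ** 2 + max(0.0, v - v_max) ** 2
        for v in voltages_pu
    )
    # Squared excess of each branch flow magnitude over its limit.
    s_violation = sum(
        max(0.0, abs(s) - s_lim) ** 2
        for s, s_lim in zip(branch_flows, branch_limits)
    )
    return base_cost + rho * (v_violation + s_violation)
```

With these assumed defaults, a feasible operating point is returned unchanged, while a single bus at 1.10 p.u. adds roughly 1000 × 0.05² = 2.5 to the cost. A larger `rho` pushes the learned dispatch harder toward feasibility at the expense of a steeper cost landscape.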
License
Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.
All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.
Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.







