Joint Symbol-Text Parsing in Power Grid Blueprints via Multimodal Fusion Using YOLOv7, PP-OCRv3, and GCN

Abstract

To meet the need for efficient analysis of blueprint information in the intelligent construction of power grid projects, this paper proposes a joint parsing algorithm for power grid blueprint symbols and text based on multimodal fusion. The method uses a two-stream feature extraction and cross-modal alignment framework. First, the YOLOv7 model is combined with spatial pyramid pooling to improve the detection of small electrical symbols. Second, the high-precision PP-OCRv3 engine performs text detection and recognition, and positional encoding is introduced to enhance its spatial awareness. Finally, a symbol-text association matrix is constructed, and its topological connections are modelled using a graph convolutional network (GCN). In addition, an attention-guided feature fusion module (AG-Fusion) performs dynamic weighted fusion of visual and textual features, enabling end-to-end joint parsing. To verify the effectiveness of the algorithm, systematic experiments are conducted on a self-built power grid blueprint dataset, GBD-1.0, which contains 217 standard blueprints, 12 types of electrical symbols and 3862 text instances. The experimental results show that the method achieves 93.7% mAP@0.5 in symbol detection, a 95.4% F1 score in text recognition, and 89.2% accuracy in joint parsing, the most critical task. The algorithm resolves parsing ambiguity in complex scenarios, such as drawing occlusion and dense text, and provides reliable technical support for the digital construction of power grids.
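The symbol-text association and graph step described above can be illustrated with a small sketch. The abstract does not specify how the association matrix is built or which GCN variant is used, so everything here is an assumption made for illustration: the Gaussian distance kernel, the `sigma` and `threshold` values, the standard symmetric normalisation with self-loops, and all function names are hypothetical and are not taken from the paper.

```python
import math


def center(box):
    """Return the centre (x, y) of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def association_matrix(symbol_boxes, text_boxes, sigma=50.0):
    """Rows are symbols, columns are text boxes; entries lie in (0, 1].

    Assumption: affinity decays with centre distance via a Gaussian kernel.
    """
    matrix = []
    for s in symbol_boxes:
        sx, sy = center(s)
        row = []
        for t in text_boxes:
            tx, ty = center(t)
            d2 = (sx - tx) ** 2 + (sy - ty) ** 2
            row.append(math.exp(-d2 / (2.0 * sigma ** 2)))
        matrix.append(row)
    return matrix


def bipartite_adjacency(assoc, threshold=0.5):
    """Build a symmetric adjacency over [symbols..., texts...] nodes."""
    n_s = len(assoc)
    n_t = len(assoc[0]) if assoc else 0
    n = n_s + n_t
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n_s):
        for j in range(n_t):
            if assoc[i][j] >= threshold:
                adj[i][n_s + j] = 1.0
                adj[n_s + j][i] = 1.0
    return adj


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_layer(adj, features, weight):
    """One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    out = matmul(matmul(norm, features), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Toy layout: two symbols, each with one nearby label (made-up coordinates).
symbol_boxes = [(0, 0, 20, 20), (200, 200, 220, 220)]
text_boxes = [(25, 0, 65, 20), (205, 225, 245, 245)]

assoc = association_matrix(symbol_boxes, text_boxes)
adj = bipartite_adjacency(assoc)

# Identity features and weights make the normalised adjacency visible directly.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
hidden = gcn_layer(adj, identity, identity)
```

In this toy layout each symbol is linked only to its nearby label, so after one propagation step a symbol node mixes its own features with those of its associated text node. A real pipeline would use detector and OCR features in place of the identity matrix and learned weights in place of the identity weight.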

Authors

  • Xufei Liu, Power Dispatching and Control Center of Yunnan Power Grid Co., Ltd
  • Xiying Wang, Power Dispatching and Control Center of Yunnan Power Grid Co., Ltd
  • Shuling Wang, Power Dispatching and Control Center of Yunnan Power Grid Co., Ltd
  • Wenchao Qin, Power Dispatching and Control Center of Yunnan Power Grid Co., Ltd
  • Yiran Tao, Power Dispatching and Control Center of Yunnan Power Grid Co., Ltd

DOI:

https://doi.org/10.31449/inf.v50i7.12060

Published

02/21/2026

How to Cite

Liu, X., Wang, X., Wang, S., Qin, W., & Tao, Y. (2026). Joint Symbol-Text Parsing in Power Grid Blueprints via Multimodal Fusion Using YOLOv7, PP-OCRv3, and GCN. Informatica, 50(7). https://doi.org/10.31449/inf.v50i7.12060