BERT-GAT: Hierarchical Feature Interaction with Dynamic Multi-Hop Attention for Unstructured Data Management
Abstract
Unstructured data is growing rapidly, yet traditional methods struggle to capture both its deep semantics and its complex structural relationships. This paper proposes a BERT-GAT fusion architecture to address this gap: BERT-base performs semantic encoding (capturing contextual features), while a standard two-layer GAT with 8 attention heads models structure. The architecture integrates a hierarchical feature interaction layer (fusing multi-granularity semantics) and a dynamic multi-hop attention module (modeling long-distance dependencies). Experiments are conducted on a proprietary dataset of 999,000 unstructured texts from a power grid management system (training/test split 8:2). Evaluation metrics are accuracy (P), recall (R), and F1-score, with CNN, BERT-base, and GAT-alone baselines. The fusion architecture achieves 87.0% accuracy (23.5% higher than CNN, p<0.01), 45.67% recall (12 percentage points higher than BERT-base, p<0.05), and an F1-score 0.75 higher than BERT alone. The average retrieval response time is 56.2±3.1 seconds on dual NVIDIA A100 GPUs. This work provides a robust framework for unstructured data management by integrating semantic and structural modeling.

DOI: https://doi.org/10.31449/inf.v49i20.10546
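The GAT and multi-hop components named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (which is not public): `gat_head` is a standard single-head graph-attention layer in the usual GAT formulation, and `multi_hop` stands in for the paper's dynamic multi-hop attention module using fixed hop weights in place of a learned gate; all function names, shapes, and weights here are illustrative assumptions.

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gat_head(h, adj, W, a, alpha=0.2):
    """One graph-attention head (standard GAT formulation).
    h: (N, F) node features; adj: (N, N) adjacency with self-loops;
    W: (F, F') projection; a: (2*F',) attention vector.
    Returns projected, attention-aggregated features and the attention matrix."""
    z = h @ W                                   # (N, F') projected features
    f = z.shape[1]
    src = z @ a[:f]                             # source-side logit per node, (N,)
    dst = z @ a[f:]                             # target-side logit per node, (N,)
    e = leaky_relu(src[:, None] + dst[None, :], alpha)  # e_ij = LeakyReLU(a^T[z_i || z_j])
    e = np.where(adj > 0, e, -1e9)              # mask non-neighbors before softmax
    att = softmax(e, axis=1)                    # rows sum to 1 over neighbors
    return att @ z, att

def multi_hop(att, z, hop_weights):
    """Simplified multi-hop aggregation: mix successive powers of the
    attention matrix (att^k @ z), weighted by hop_weights. In the paper
    these weights would be produced dynamically; here they are fixed."""
    out = np.zeros_like(z)
    hop = z
    for w in hop_weights:
        hop = att @ hop                          # one more attention hop
        out = out + w * hop
    return out
```

Stacking two such layers with 8 heads each (concatenating head outputs) would match the abstract's stated GAT configuration; the multi-hop mixer lets distant nodes influence each other without adding layers.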
License
Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.
All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.
Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.