A Multi-modal Diffusion Model-Based Digital Twin Framework for Stadium Management via IoT Data Fusion
Abstract
This study proposes a method for constructing a sports venue digital twin system that integrates a multi-modal diffusion model with Internet of Things (IoT) data, aiming at high-precision modeling and intelligent prediction of venue status. The framework consists of four layers (perception, data processing, modeling, and application) that form a closed perception–fusion–modeling–feedback loop. The experimental multimodal dataset comprises over 50,000 high-resolution monitoring images, more than 8,000 daily sensor records (temperature, humidity, CO₂, light, and noise), 15,000 text logs, and crowd and environmental audio spectrograms, collected by a sensor network sampling at 1–5 s intervals. By integrating these streams, the diffusion model achieves semantic fusion and predictive reconstruction with high robustness. The method was benchmarked against CNN, GNN, and SVM baselines, as well as Transformer-based multimodal fusion and Graph Attention Networks (GATs). Relative to CNN-based models, the multimodal diffusion model reduced image, speech, and text processing times from 122 ms, 96 ms, and 78 ms to 78 ms, 65 ms, and 49 ms, respectively, cutting overall latency by 35.1%. The overall sensor data integrity rate exceeded 98% (99.53% for the pedestrian flow sensor). For digital twin modeling, spatial restoration accuracy reached 96.3%, motion trajectory simulation 94.7%, and environmental prediction 93.5%, for an average accuracy of 94.8%, consistently outperforming the baseline approaches. The multi-modal diffusion model and the IoT-coupled digital twin system developed in this study perform well in perception fusion, scene prediction, and interaction, providing a strong theoretical basis and engineering support for the intelligent operation of sports venues.
DOI: https://doi.org/10.31449/inf.v49i28.10300
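The two aggregate figures in the abstract can be reproduced from the per-item numbers it quotes. The sketch below is only a consistency check, not the authors' evaluation code. It assumes that "overall latency" means the sum of the three per-modality processing times and that "average accuracy" is the unweighted mean of the three accuracy scores. The abstract states neither definition, and all variable and function names are illustrative.

```python
# Per-modality processing times quoted in the abstract, in milliseconds.
cnn_ms = {"image": 122, "speech": 96, "text": 78}
diffusion_ms = {"image": 78, "speech": 65, "text": 49}

# Accuracy figures quoted in the abstract, in percent.
accuracy_pct = {"spatial": 96.3, "trajectory": 94.7, "environment": 93.5}


def overall_reduction_pct(baseline, proposed):
    """Percentage drop in total time.

    Assumption: overall latency is the sum of the per-modality times.
    """
    base_total = sum(baseline.values())
    new_total = sum(proposed.values())
    return 100.0 * (base_total - new_total) / base_total


def mean_accuracy(scores):
    """Unweighted mean of the scores (assumption: no weighting is applied)."""
    return sum(scores.values()) / len(scores)


latency_drop = overall_reduction_pct(cnn_ms, diffusion_ms)
avg_accuracy = mean_accuracy(accuracy_pct)

print(f"overall latency reduction: {latency_drop:.1f}%")
print(f"average accuracy: {avg_accuracy:.1f}%")
```

Under these assumptions the totals are 296 ms and 192 ms, which gives the reported 35.1% reduction, and the mean of the three accuracy scores rounds to the reported 94.8%.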
License
I assign to Informatica, An International Journal of Computing and Informatics ("Journal") the copyright in the manuscript identified above and any additional material (figures, tables, illustrations, software or other information intended for publication) submitted as part of or as a supplement to the manuscript ("Paper") in all forms and media throughout the world, in all languages, for the full term of copyright, effective when and if the article is accepted for publication. This transfer includes the right to reproduce and/or to distribute the Paper to other journals or digital libraries in electronic and online forms and systems.
I understand that I retain the rights to use the pre-prints, off-prints, accepted manuscript and published journal Paper for personal use, scholarly purposes and internal institutional use.
In certain cases, I can ask for retaining the publishing rights of the Paper. The Journal can permit or deny the request for publishing rights, to which I fully agree.
I declare that the submitted Paper is original, has been written by the stated authors and has not been published elsewhere nor is currently being considered for publication by any other journal and will not be submitted for such review while under review by this Journal. The Paper contains no material that violates proprietary rights of any other person or entity. I have obtained written permission from copyright owners for any excerpts from copyrighted works that are included and have credited the sources in my article. I have informed the co-author(s) of the terms of this publishing agreement.
Copyright © Slovenian Society Informatika







