Spatiotemporal GAN-Based Real-Time Animation and Multi-Modal Interaction Optimization for Virtual Reality

Zheng Wang

doi:10.31449/inf.v49i35.11457

Abstract

Animation generation and multi-modal interaction in virtual reality (VR) environments still suffer from low generation quality, poor real-time performance, and insufficient fusion between modalities, which seriously limit the realism and interaction efficiency of immersive experiences. This paper proposes a method for real-time generative adversarial network (GAN) animation synthesis and multi-modal interaction optimization in VR environments. For animation synthesis, a generator architecture with a spatiotemporal-consistency adversarial training mechanism is constructed and combined with a multi-scale feature fusion strategy to achieve high-quality, low-latency animation generation. Experiments were conducted on the VR-Gesture-Voice dataset (60,000 training samples, 15,000 testing samples) and benchmarked against state-of-the-art (SOTA) models including VideoGAN and StyleGAN3. Two sets of key metrics are reported to distinguish the system scopes. For the isolated GAN animation synthesis module (excluding end-to-end transmission and rendering), the improved algorithm achieved an average rendering frame rate of 23.7 FPS (58% higher than VideoGAN) and kept the synthesis delay within 4.5 ms. For the end-to-end VR system, the average rendering frame rate was 85–92 FPS (meeting the ≥72 FPS standard for VR) and the end-to-end latency was ≤17 ms (51% lower than StyleGAN3). Regarding system resource usage, GPU (Graphics Processing Unit) memory consumption is reduced by 0.6 GB, model inference time is reduced by 49.5%, and 85% real-time rendering efficiency is still maintained at 8K resolution.
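
The frame-rate and percentage figures above can be read as per-frame time budgets and implied baselines. The following is a minimal sketch of that arithmetic, not part of the paper: the function names are hypothetical, and the baseline values it derives are not reported in the abstract; they follow only under the assumption that each percentage is measured relative to the named baseline on the same metric.

```python
def frame_time_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def baseline_from_increase(value: float, pct_higher: float) -> float:
    """Baseline implied when `value` is `pct_higher` percent above it (assumption)."""
    return value / (1.0 + pct_higher / 100.0)


def baseline_from_reduction(value: float, pct_lower: float) -> float:
    """Baseline implied when `value` is `pct_lower` percent below it (assumption)."""
    return value / (1.0 - pct_lower / 100.0)


if __name__ == "__main__":
    # Per-frame budgets for the VR threshold and the reported end-to-end range.
    for fps in (72.0, 85.0, 92.0):
        print(f"{fps:.0f} FPS -> {frame_time_ms(fps):.2f} ms per frame")

    # Implied baselines (assumed, not stated in the abstract).
    print(f"Implied VideoGAN frame rate: {baseline_from_increase(23.7, 58.0):.1f} FPS")
    print(f"Implied StyleGAN3 latency: {baseline_from_reduction(17.0, 51.0):.1f} ms")
```

Under these assumptions, the 72 FPS threshold corresponds to a budget of about 13.9 ms per frame, and the reported 85–92 FPS range to roughly 11.8–10.9 ms per frame.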

Authors

Zheng Wang, Shandong Media Vocational College

DOI:

https://doi.org/10.31449/inf.v49i35.11457

Published

12/16/2025

How to Cite

Wang, Z. (2025). Spatiotemporal GAN-Based Real-Time Animation and Multi-Modal Interaction Optimization for Virtual Reality. Informatica, 49(35). https://doi.org/10.31449/inf.v49i35.11457

Issue

Vol. 49 No. 35 (2025): Online-only issue

Section

Online-only

License

Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.

All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.

Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.
