Efficient Sparse Input Scene Reconstruction and Real-Time Rendering for VR Advertising Using Optimized NeRF Framework
Abstract
Virtual reality (VR) advertising scenarios demand high visual fidelity, multi-view coverage, and real-time interaction. Traditional 3D reconstruction methods fall short of these demands: the modeling process is complex (professionals typically need several weeks to complete a scene) and support for dynamic elements is limited (keyframe animations produce stiff transitions). The original Neural Radiance Field (NeRF) model, in turn, suffers from key bottlenecks such as low generation efficiency (hundreds of input images and tens of hours of training) and high rendering overhead (over 200 ms per frame). This paper proposes a full-pipeline solution of rapid sparse-input reconstruction, lightweight dynamic modeling, and hardware-adapted rendering optimization. First, to match the short production cycles and low budgets of VR advertising, a sparse NeRF reconstruction framework integrating depth priors and semantic guidance is proposed; through hierarchical initialization and a joint optimization strategy, it achieves high-fidelity scene reconstruction from only 15-20 input images. Second, to support dynamic product displays in advertisements, a dynamic modeling method based on local deformation fields is designed; using spatial masks and temporal consistency constraints, it reduces the parameter count by 95% relative to DyNeRF while preserving dynamic smoothness (SSIM ≥ 0.92). Finally, to accommodate VR hardware characteristics (limited mobile computing power and low-latency requirements), a rendering optimization chain of explicit baking, LOD adaptation, and GPU parallelization is proposed, achieving stable 90 fps rendering on a standalone VR platform (Pico 4). Experiments were conducted on PC and standalone platforms using the self-built VR-AD-12 dataset (12 advertising scenes) and public datasets (Tanks and Temples, DTU).
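The local deformation field idea described above can be illustrated with a toy sketch: only points inside a soft spatial mask are displaced over time, the static background stays fixed, and a temporal consistency term penalizes frame-to-frame jumps in the predicted offsets. This is a minimal illustration under assumed conventions (the function names `spatial_mask`, `deform`, and `temporal_consistency_loss` are hypothetical), not the paper's actual implementation:

```python
import numpy as np

def spatial_mask(points, center, radius):
    """Soft mask: 1 at the center of the dynamic region, decaying linearly to 0 outside."""
    d = np.linalg.norm(points - center, axis=-1)
    return np.clip(1.0 - d / radius, 0.0, 1.0)

def deform(points, t, center, radius, amplitude=0.05):
    """Toy local deformation: masked points oscillate along y over time t; others stay fixed."""
    m = spatial_mask(points, center, radius)[:, None]          # (N, 1) per-point weight
    offset = amplitude * np.sin(2 * np.pi * t) * np.array([0.0, 1.0, 0.0])
    return points + m * offset                                  # static points get zero offset

def temporal_consistency_loss(offsets_t, offsets_t1):
    """Penalize abrupt frame-to-frame changes in predicted offsets (MSE)."""
    return float(np.mean((offsets_t1 - offsets_t) ** 2))
```

Because the deformation network only needs to cover the masked region rather than the whole scene, its capacity (and hence parameter count) can be kept far smaller than a full dynamic NeRF.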
The results show that the proposed method shortens scene generation time by 62.3% relative to the original NeRF (from 24±1.2 h to 3.6±0.2 h), dynamic modeling reaches a PSNR of 32.1±0.3 dB, rendering latency stays below 18 ms (17.8±0.5 ms on Pico 4), LPIPS is 0.09±0.01, and FID is 22.3±1.2. Core metrics outperform existing methods, and the approach meets the industrial production requirements of VR advertising.
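For reference, the PSNR figures reported above are the standard peak signal-to-noise ratio between rendered and ground-truth views. A minimal sketch, assuming images are arrays normalized to [0, 1] (the function name `psnr` is illustrative):

```python
import numpy as np

def psnr(reference, rendered, peak=1.0):
    """Peak signal-to-noise ratio in dB between two images with values in [0, peak]."""
    mse = np.mean((np.asarray(reference) - np.asarray(rendered)) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```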
DOI: https://doi.org/10.31449/inf.v50i9.12804
License
Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.
All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.
Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.