Metric-Wise Comparative Analysis of Hybrid CNN–SRU/LSTM and Lightweight CNN–MIL Frameworks for Deployment-Oriented Video Anomaly Detection
Abstract
Video anomaly detection is a critical component of intelligent surveillance systems, where detection accuracy, temporal stability, computational efficiency, and deployment feasibility must be considered jointly. Existing studies frequently rely on ROC–AUC as the primary evaluation metric, which provides limited insight into practical system performance. This study presents a structured, metric-wise comparative analysis of hybrid CNN–SRU/LSTM architectures and lightweight CNN-based multiple instance learning (MIL) frameworks, based on systematically collected benchmark results from datasets such as UCF-Crime and ShanghaiTech. The analysis follows a literature-driven methodology and evaluates models along several dimensions: AUC, false alarm rate (FAR), temporal stability, inference speed in frames per second (FPS), computational footprint, and calibration reliability measured by expected calibration error (ECE). Deployment-oriented factors such as latency–performance trade-offs, cross-domain robustness, and scalability under limited labeled data are also examined. Results indicate that hybrid CNN–SRU/LSTM frameworks achieve approximately 85.9% AUC across benchmark datasets with strong temporal consistency, while CNN–MIL approaches maintain competitive accuracy (≈82–84.7% AUC) with significantly higher efficiency (up to 72 FPS) and improved calibration (ECE reduced from ~0.17 to ~0.10). Transformer-based and vision–language models achieve slightly higher accuracy (>86% AUC) but run at substantially lower frame rates (<12 FPS) and require more memory (>800 MB). These findings show that marginal accuracy gains often incur substantial computational cost, underscoring the need for multi-metric evaluation and hardware-aware model selection in practical video anomaly detection systems.
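Since the abstract reports calibration through ECE, a minimal sketch of one common way to compute it may help make the metric concrete. It assumes anomaly scores are probabilities in [0, 1], uses equal-width bins, and compares the mean score in each bin with the observed fraction of anomalous samples. The function name, the bin count of 10, and this positive-class variant of ECE are illustrative assumptions; the paper's exact definition is not stated here.

```python
def expected_calibration_error(scores, labels, n_bins=10):
    """Equal-width-bin ECE for binary anomaly scores.

    Assumptions (not taken from the paper): scores are probabilities
    in [0, 1], labels are 0 (normal) or 1 (anomalous), and calibration
    is measured on the positive (anomalous) class.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(scores) == 0:
        raise ValueError("at least one sample is required")
    if any(s < 0.0 or s > 1.0 for s in scores):
        raise ValueError("scores must lie in [0, 1]")

    # Assign each sample to a bin; a score of exactly 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for s, y in zip(scores, labels):
        idx = min(int(s * n_bins), n_bins - 1)
        bins[idx].append((s, y))

    total = len(scores)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_score = sum(s for s, _ in members) / len(members)
        anomaly_rate = sum(y for _, y in members) / len(members)
        # Weight each bin's calibration gap by its share of the samples.
        ece += (len(members) / total) * abs(mean_score - anomaly_rate)
    return ece


if __name__ == "__main__":
    # Hypothetical frame-level scores and ground-truth labels.
    demo_scores = [0.05, 0.20, 0.35, 0.60, 0.80, 0.95]
    demo_labels = [0, 0, 0, 1, 1, 1]
    print(expected_calibration_error(demo_scores, demo_labels))
```

A lower value means the scores can be read more directly as anomaly probabilities, which matters when a fixed alarm threshold is carried across cameras or datasets.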
DOI: https://doi.org/10.31449/inf.v50i13.14031
License
Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.
All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.
Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.







