Metric-Wise Comparative Analysis of Hybrid CNN–SRU/LSTM and Lightweight CNN–MIL Frameworks for Deployment-Oriented Video Anomaly Detection

Rajat Gupta; Nidhi Tyagi

doi:10.31449/inf.v50i13.14031

Abstract

Video anomaly detection is a critical component of intelligent surveillance systems, where detection accuracy, temporal stability, computational efficiency, and real-world deployment feasibility must be considered jointly. Existing studies frequently rely on ROC–AUC as the primary evaluation metric, which provides limited insight into practical system performance. This study presents a structured metric-wise comparative analysis of hybrid CNN–SRU/LSTM architectures and lightweight CNN-based multiple instance learning (MIL) frameworks, based on systematically collected benchmark results from datasets such as UCF-Crime and ShanghaiTech. The analysis follows a literature-driven methodology and evaluates models across multiple dimensions, including AUC, false alarm rate (FAR), temporal stability, inference speed in frames per second (FPS), computational footprint, and calibration reliability measured by expected calibration error (ECE). Deployment-oriented factors such as latency–performance trade-offs, cross-domain robustness, and scalability under limited labeled data are also examined. Results indicate that hybrid CNN–SRU/LSTM frameworks achieve approximately 85.9% AUC across benchmark datasets with strong temporal consistency, while CNN–MIL approaches maintain competitive accuracy (≈82–84.7%) with significantly higher efficiency (up to 72 FPS) and improved calibration (ECE reduced from ~0.17 to ~0.10). Transformer-based and vision–language models achieve slightly higher accuracy (>86% AUC) but run at substantially lower frame rates (<12 FPS) and require more memory (>800 MB). These findings show that marginal accuracy gains often incur substantial computational cost, which underscores the need for multi-metric evaluation and hardware-aware model selection in practical video anomaly detection systems.
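
Two of the metrics named in the abstract, FAR and ECE, can be illustrated with a short sketch. This is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the choice of 10 equal-width bins are assumptions made for illustration, and the scores are assumed to be per-frame anomaly probabilities in [0, 1] with binary ground-truth labels (1 = anomalous).

```python
import numpy as np


def false_alarm_rate(scores, labels, threshold=0.5):
    """Fraction of normal frames (label 0) scored at or above the threshold.

    The threshold value is an illustrative assumption, not taken from the paper.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    normal = labels == 0
    if not normal.any():
        return 0.0  # no normal frames, so no false alarms can be counted
    return float(np.mean(scores[normal] >= threshold))


def expected_calibration_error(scores, labels, n_bins=10):
    """Binary ECE: weighted mean of |observed anomaly rate - mean score| per bin.

    Equal-width binning with n_bins=10 is an assumed, commonly used default.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Map each score to a bin index in [0, n_bins - 1] using the interior edges;
    # the clip guards against scores that fall outside [0, 1].
    bin_ids = np.clip(np.digitize(scores, edges[1:-1]), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        in_bin = bin_ids == b
        if in_bin.any():
            gap = abs(labels[in_bin].mean() - scores[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by the share of frames in the bin
    return float(ece)


if __name__ == "__main__":
    # Toy per-frame scores and labels, invented purely for demonstration.
    demo_scores = [0.05, 0.25, 0.75, 0.95]
    demo_labels = [0, 0, 1, 1]
    print("FAR:", false_alarm_rate(demo_scores, demo_labels, threshold=0.2))
    print("ECE:", expected_calibration_error(demo_scores, demo_labels))
```

Reporting both values alongside AUC reflects the abstract's argument: a detector can rank frames well while still raising many false alarms at its operating threshold or producing poorly calibrated scores.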
