Diagnosis-Aware Real-Time Video Face Recognition Model (DART-VFR)

Hasan A. Abdulla, Luluwah A. Y. Al-Hbetiw, Mazin N. Farhan

Abstract


Facial recognition technologies have progressed considerably over the years; however, traditional methods still struggle in dynamic environments and cannot adapt to contextual information. In this paper, we propose a new diagnosis-aware real-time video face recognition framework, termed DART-VFR, to address these challenges. The proposed pipeline incorporates additional metadata, such as demographic, behavioral, and environmental data, by means of a Bayesian inference approach with an adaptive threshold. DART-VFR uses deep learning-based feature extraction and contextual fusion to enhance recognition performance and robustness. The model achieves an accuracy of 98.7% with a 65.3% reduction in false positives. Other metrics, namely a precision of 92%, a recall of 90%, and an average latency below 100 ms, indicate that the model is suitable for deployment on edge devices. Experimental results on hybrid datasets indicate that DART-VFR achieves better accuracy, flexibility, and efficiency than state-of-the-art methods. These results suggest that the system has the potential to be used in real-time applications such as healthcare, surveillance, and similarly sensitive settings, in which context-aware and ethically aligned face recognition is important.
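As an illustration only, the idea of combining a face-match score with contextual metadata through Bayesian inference and an adaptive threshold can be sketched as follows. The abstract does not give the actual formulation, so everything here is an assumption: the function names, the logistic mapping from similarity to a likelihood ratio, the parameters `like_scale`, `midpoint`, `base`, and `max_raise`, and the use of a frame-quality score to adapt the threshold are all hypothetical and are not taken from DART-VFR.

```python
import math


def posterior_match(similarity, prior, like_scale=10.0, midpoint=0.5):
    """Posterior probability that a probe face matches a gallery identity.

    Hypothetical model (not from the paper): the log-likelihood ratio of
    match vs. non-match is assumed linear in the embedding similarity, and
    `prior` is a contextual prior in the open interval (0, 1), e.g. derived
    from demographic, behavioral, or environmental metadata.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    log_lr = like_scale * (similarity - midpoint)      # assumed likelihood model
    log_prior_odds = math.log(prior / (1.0 - prior))   # contextual prior as odds
    log_post_odds = log_lr + log_prior_odds            # Bayes' rule in log-odds form
    return 1.0 / (1.0 + math.exp(-log_post_odds))


def adaptive_threshold(quality, base=0.8, max_raise=0.15):
    """Decision threshold that rises as frame quality (0..1) drops.

    Hypothetical rule: poor frames require stronger evidence to accept.
    """
    quality = min(max(quality, 0.0), 1.0)  # clamp to [0, 1]
    return base + max_raise * (1.0 - quality)


def accept(similarity, prior, quality):
    """Accept the match when the posterior clears the adaptive threshold."""
    return posterior_match(similarity, prior) >= adaptive_threshold(quality)


if __name__ == "__main__":
    # Same similarity and prior, evaluated on a clean frame and a degraded frame.
    print(accept(0.7, 0.5, quality=1.0))
    print(accept(0.7, 0.5, quality=0.0))
```

In this sketch, a stronger contextual prior raises the posterior for the same similarity score, while a lower-quality frame raises the bar for acceptance; this is one simple way a context-aware system could trade recall for fewer false positives.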







DOI: https://doi.org/10.31449/inf.v49i20.9734

This work is licensed under a Creative Commons Attribution 3.0 License.