MTCNN-UGAN: A Self-Attention Enhanced Face Replacement Pipeline for Film and Television Video Frames
Abstract
At present, face replacement technology for film and television video faces problems such as low accuracy and high resource consumption. This study proposes an automated face replacement technique that integrates an improved multi-task cascaded convolutional neural network (MTCNN) with a generative adversarial network (GAN). In the face detection stage, MTCNN is used, with median-filter preprocessing and depthwise separable convolutions introduced into the model. In the face replacement stage, a U-Net-based generative adversarial network (UGAN) is constructed: its generator consists of an encoder and a decoder and embeds a dual-skip-connection residual module, while its discriminator adopts a self-attention mechanism and a video-stabilization module. In the experiments, the WIDER FACE and CelebFaces Attributes (CelebA) datasets were used for the face detection task, and the high-resolution CelebAMask-HQ dataset and the Deepfake Model Attribution Dataset (FDM) were used for the face replacement task. FaceSwap and the attribute-preserving generative adversarial network (AP-GAN) served as comparison baselines. In the face detection experiments, the proposed model performed best in both accuracy and training loss across different face detection scenes; for example, in complex scenes its accuracy was 93.25% and its training loss was 0.221. In the face replacement experiments, the model replaced faces in four image sets, preserving color and facial contour structure well and producing more natural replacements. In the similarity comparison, the proposed model achieved the highest face replacement similarity index across different frame counts, with an average value of 0.994, and it also performed best in the peak signal-to-noise ratio test, with an average value of 35.65 dB. Finally, in the composite face replacement test, the model performed best in both structural similarity and state error. In conclusion, the technique shows good application results, and this study can provide technical support for the improvement of face replacement technology as well as face characterization.
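The abstract names several architectural components without code, and the paper publishes none. As a minimal illustrative sketch, assuming a PyTorch implementation, the block below shows the two most concrete ideas: a depthwise separable convolution (the lightweight substitute introduced into the MTCNN stages) and one plausible reading of the generator's dual-skip-connection residual module. All class names, channel sizes, and the exact skip placement are assumptions, not the authors' published design.

```python
# Illustrative sketch only: the paper does not release code, so the module
# names, channel sizes, and layer ordering below are assumptions based on
# the abstract's description of the MTCNN-UGAN components.
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    """Depthwise separable convolution, as introduced to lighten the MTCNN
    stages: a per-channel (depthwise) convolution followed by a 1x1
    (pointwise) convolution that mixes channels."""
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size, stride,
                                   padding=kernel_size // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


class DualSkipResidualBlock(nn.Module):
    """One plausible reading of the 'dual skip connection residual module'
    in the UGAN generator: an inner residual skip around two convolutions
    plus an outer skip that also bypasses the final activation."""
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        inner = self.act(self.conv1(x))
        inner = self.conv2(inner) + x   # inner residual skip
        return self.act(inner) + x      # outer skip around the activation


if __name__ == "__main__":
    x = torch.randn(1, 32, 64, 64)
    print(DepthwiseSeparableConv(32, 64)(x).shape)  # torch.Size([1, 64, 64, 64])
    print(DualSkipResidualBlock(32)(x).shape)       # torch.Size([1, 32, 64, 64])
```

The usual motivation for the depthwise separable variant is parameter count: a standard k x k convolution costs k^2 * C_in * C_out weights, while the separable form costs k^2 * C_in + C_in * C_out, which matters in a cascaded detector that runs at multiple scales.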
DOI: https://doi.org/10.31449/inf.v49i5.8927