Deep Learning-based CNN Multi-modal Camera Model Identification for Video Source Identification

Surjeet Singh; Vivek Kumar Sehgal

doi:10.31449/inf.v47i3.4392

Abstract

Multimedia forensic analysts are increasingly asked to identify the camera that captured a given
photograph or video. Progress in source identification has helped to resolve copyright infringement
disputes and to identify those responsible for serious offences. This study addresses camera model
identification for video sequences, that is, determining the model of camera used to capture the
video sequence under investigation. For this purpose, we developed two distinct CNN-based camera
model recognition techniques for use in a novel multi-modal setting. The proposed multi-modal methods
combine audio and visual information, in contrast to mono-modal methods, which use only the visual
or only the audio information from the investigated video. Forensic science is the application of
science to criminal and civil law, primarily during criminal investigations, in line with legal
standards of admissible evidence and criminal procedure. It covers collecting, preserving, and
analyzing scientific evidence in the course of an investigation, and it has become a critical part
of criminology as a result of the rapid rise in crime rates over the last few decades. The proposed
methods were tested on the well-known Vision dataset, which contains about 2000 video sequences
gathered from a variety of devices. Experiments were conducted on videos shared through social
media platforms such as YouTube and WhatsApp as well as on native videos obtained directly from
their acquisition devices. The results indicate that the multi-modal approaches greatly outperform
their mono-modal equivalents, making them an effective approach to the problem and opening the
possibility of tackling even more difficult scenarios in the future.
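
The abstract does not state how the audio and visual predictions are combined. As a purely
illustrative sketch, one common option is late fusion: each mono-modal CNN produces per-class
scores, and the two score vectors are merged with a weighted average. The function names, the
example logits, and the weighting parameter below are assumptions made for illustration and are
not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(visual_logits, audio_logits, visual_weight=0.5):
    """Weighted average of the visual and audio class probabilities.

    visual_weight is a hypothetical tuning parameter: 1.0 uses only the
    visual branch, 0.0 uses only the audio branch.
    """
    if len(visual_logits) != len(audio_logits):
        raise ValueError("both branches must score the same set of camera models")
    p_visual = softmax(visual_logits)
    p_audio = softmax(audio_logits)
    return [
        visual_weight * v + (1.0 - visual_weight) * a
        for v, a in zip(p_visual, p_audio)
    ]


def predict_model(visual_logits, audio_logits, labels, visual_weight=0.5):
    """Return the camera model label with the highest fused probability."""
    fused = late_fusion(visual_logits, audio_logits, visual_weight)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best], fused


# Made-up scores for three hypothetical camera models.
labels = ["model_A", "model_B", "model_C"]
visual_logits = [2.0, 1.9, 0.0]  # visual branch is nearly undecided between A and B
audio_logits = [0.0, 3.0, 0.0]   # audio branch clearly favours B
label, fused = predict_model(visual_logits, audio_logits, labels)
print(label, [round(p, 3) for p in fused])
```

In this toy case the visual branch alone narrowly prefers model_A, while the fused scores prefer
model_B, which shows how a second modality can change a close decision.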

DOI: https://doi.org/10.31449/inf.v47i3.4392

This work is licensed under a Creative Commons Attribution 3.0 License.
