Ismail Aya, Elpeltagy Marwa, Zaki Mervat, ElDahshan Kamal A
Mathematics Department, Tanta University, Tanta, Al-Gharbia, Egypt.
Systems and Computers Department, Al-Azhar University, Cairo, Nasr City, Egypt.
PeerJ Comput Sci. 2021 Sep 21;7:e730. doi: 10.7717/peerj-cs.730. eCollection 2021.
Recently, deepfake techniques for swapping faces have been spreading, allowing easy creation of hyper-realistic fake videos. Detecting the authenticity of a video has become increasingly critical because of the potential negative impact on the world. Here, a new scheme, You Only Look Once Convolutional Recurrent Neural Networks (YOLO-CRNNs), is introduced to detect deepfake videos. The YOLO-Face detector extracts face regions from each frame of the video, and a fine-tuned EfficientNet-B5 extracts the spatial features of these faces. These features are fed as a batch of input sequences into a Bidirectional Long Short-Term Memory (Bi-LSTM) network to extract temporal features. The new scheme is then evaluated on a new large-scale dataset, CelebDF-FaceForensics++ (c23), built by combining two popular datasets, FaceForensics++ (c23) and Celeb-DF. With the pasting data approach, it achieves an Area Under the Receiver Operating Characteristic Curve (AUROC) score of 89.35%, 89.38% accuracy, 83.15% recall, 85.55% precision, and an 84.33% F1-measure. The experimental analysis confirms the superiority of the proposed method over state-of-the-art methods.
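The classification stage of the pipeline described above can be sketched as a CNN-to-Bi-LSTM model. This is a minimal illustrative sketch, not the authors' implementation: the lightweight convolutional backbone here is a stand-in for the fine-tuned EfficientNet-B5, the face crops are assumed to come from the YOLO-Face detector, and all dimensions are hypothetical.

```python
import torch
import torch.nn as nn

class DeepfakeCRNN(nn.Module):
    """Sketch of the CRNN classification stage: per-frame spatial
    features (placeholder for a fine-tuned EfficientNet-B5) feed a
    Bi-LSTM that models temporal dependencies across frames."""

    def __init__(self, feat_dim=2048, hidden=256):
        super().__init__()
        # Placeholder backbone; the paper fine-tunes EfficientNet-B5
        # on YOLO-Face crops. Input here: (B*T, 3, 64, 64) face crops.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Bidirectional LSTM over the per-frame feature sequence.
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                              bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)  # real-vs-fake logit

    def forward(self, clips):               # clips: (B, T, 3, H, W)
        b, t = clips.shape[:2]
        # Extract spatial features frame by frame, then regroup by clip.
        feats = self.backbone(clips.flatten(0, 1)).view(b, t, -1)
        seq, _ = self.bilstm(feats)         # (B, T, 2 * hidden)
        return self.head(seq[:, -1])        # one logit per clip
```

For example, a batch of 2 clips of 8 face crops, `DeepfakeCRNN()(torch.randn(2, 8, 3, 64, 64))`, yields a tensor of shape `(2, 1)` of per-clip logits.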