Zhao Lei, Zhang Mingcheng, Ding Hongwei, Cui Xiaohui
Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China.
Entropy (Basel). 2021 Dec 17;23(12):1692. doi: 10.3390/e23121692.
Significant progress has been made in generating counterfeit images and videos. Forged videos produced by deepfake techniques have spread widely and caused severe societal impacts, drawing public attention to automatic deepfake detection technology. Recently, many deepfake detection methods based on forgery features have been proposed. Among the popular forgery features, textural features are widely used. However, most current texture-based detection methods extract textures directly from RGB images and ignore mature spectral analysis methods. Therefore, this research proposes MFF-Net, a deepfake detection network that fuses RGB features with textural information extracted by neural networks and signal processing methods. Specifically, it consists of four key components: (1) a feature extraction module that further extracts textural and frequency information using Gabor convolution and residual attention blocks; (2) a texture enhancement module that zooms into the subtle textural features in shallow layers; (3) an attention module that forces the classifier to focus on the forged part; (4) two feature fusion stages, the first fusing textural features from the shallow RGB branch and the feature extraction module, and the second fusing the textural features with semantic information. Moreover, we introduce a new diversity loss that forces the feature extraction module to learn features of different scales and directions. The experimental results show that MFF-Net generalizes well and achieves state-of-the-art performance on various deepfake datasets.
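To make the idea of Gabor-based texture extraction at "different scales and directions" more concrete, the following is a minimal NumPy sketch of a fixed Gabor filter bank applied to a grayscale image. It is an illustration only and not the MFF-Net implementation: the abstract does not specify kernel sizes, the number of orientations, or how the Gabor convolution is integrated into the network, so the function names (`gabor_kernel`, `gabor_bank`, `filter_responses`) and all parameter values here are assumptions chosen for the example.

```python
import numpy as np


def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5, psi=0.0):
    """Real part of a 2-D Gabor filter (size is assumed to be odd)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate the coordinate frame by theta to set the filter direction.
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    y_rot = -x * np.sin(theta) + y * np.cos(theta)
    # Gaussian envelope (sigma controls the scale) times a cosine carrier.
    envelope = np.exp(-(x_rot ** 2 + (gamma * y_rot) ** 2) / (2 * sigma ** 2))
    carrier = np.cos(2 * np.pi * x_rot / wavelength + psi)
    return envelope * carrier


def gabor_bank(size=7, sigmas=(1.5, 3.0), n_orientations=4):
    """Stack kernels over several scales and directions (illustrative values)."""
    kernels = []
    for sigma in sigmas:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            # Tying the wavelength to sigma is an assumption of this sketch.
            kernels.append(gabor_kernel(size, sigma, theta, wavelength=2.0 * sigma))
    return np.stack(kernels)


def filter_responses(image, bank):
    """Valid-mode cross-correlation of a 2-D image with every kernel in the bank."""
    k = bank.shape[-1]
    windows = np.lib.stride_tricks.sliding_window_view(image, (k, k))
    # Result shape: (n_kernels, H - k + 1, W - k + 1)
    return np.einsum("hwij,nij->nhw", windows, bank)
```

Each output channel responds most strongly to texture at one scale and orientation. In a learned variant, such kernels would typically initialize or modulate convolution weights rather than stay fixed; whether MFF-Net does this is not stated in the abstract.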