Localization and detection of deepfake videos based on self-blending method.

Author information

Xu Junfeng, Liu Xintao, Lin Weiguo, Shang Wenqing, Wang Yuefeng

Affiliations

Communication University of China, School of Computer & Cyber Sciences, Beijing, 100024, China.

Publication information

Sci Rep. 2025 Jan 31;15(1):3927. doi: 10.1038/s41598-025-88523-1.

Abstract

Deepfake technology, which encompasses various video manipulation techniques implemented through deep learning algorithms, such as face swapping and expression alteration, has advanced to generate fake videos that are increasingly difficult for human observers to detect, posing significant threats to societal security. Existing methods for detecting deepfake videos aim to identify such manipulated content and thereby prevent the spread of misinformation. However, these methods often generalize poorly, performing badly on fake videos from outside their training datasets. Moreover, research on the precise localization of manipulated regions within deepfake videos is limited, primarily due to the absence of datasets with fine-grained annotations that specify which regions have been manipulated.

To address these challenges, this paper proposes a novel spatial-based training method that does not require fake samples to detect spatial manipulations in deepfake videos. By employing a technique that combines multi-part local displacement deformation and fusion, we generate more diverse deepfake feature data, enhancing the detection accuracy for specific manipulation methods while producing mixed-region labels to guide manipulation localization. We use the Swin-Unet model for manipulation localization and detection, incorporating a classification loss, a local difference loss, and a manipulation localization loss to improve the precision of both localization and detection.

Experimental results demonstrate that the proposed spatial-based training method without fake samples effectively simulates the features present in real datasets. Our method achieves satisfactory detection accuracy on datasets such as FF++, Celeb-DF, and DFDC, while accurately localizing the manipulated regions. These findings indicate the effectiveness of the proposed self-blending method and model in deepfake video detection and manipulation localization.
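The abstract gives no implementation details, so the following is only a minimal sketch of the general self-blending idea it describes: displace several local parts of a real frame, fuse them back into the original, and keep the blending mask as a localization label. The elliptical regions, the pixel shift via `np.roll`, the linear mask falloff, and the names `soft_elliptical_mask` and `self_blend` are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def soft_elliptical_mask(height, width, center, axes, softness=0.25):
    """Float mask in [0, 1]: 1 well inside the ellipse, fading to 0 at its boundary."""
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = center
    ay, ax = axes
    # Normalised elliptical distance: 0 at the centre, 1 on the boundary.
    dist = np.sqrt(((ys - cy) / ay) ** 2 + ((xs - cx) / ax) ** 2)
    return np.clip((1.0 - dist) / softness, 0.0, 1.0)


def self_blend(image, regions, max_shift=3, rng=None):
    """Blend locally displaced copies of a real frame back into itself.

    image:   float array of shape (H, W, C) with values in [0, 1].
    regions: list of (center, axes) tuples, one per hypothetical facial part.
    Returns the pseudo-fake frame and a blended-region label of shape (H, W).
    """
    rng = np.random.default_rng() if rng is None else rng
    height, width = image.shape[:2]
    fake = image.copy()
    label = np.zeros((height, width))
    for center, axes in regions:
        # Local displacement (assumed form): a small non-zero shift in y and x.
        magnitude = rng.integers(1, max_shift + 1, size=2)
        sign = rng.choice([-1, 1], size=2)
        dy, dx = (magnitude * sign).tolist()
        shifted = np.roll(image, shift=(dy, dx), axis=(0, 1))
        mask = soft_elliptical_mask(height, width, center, axes)
        # Fusion: alpha-blend the displaced copy into the running result.
        fake = mask[..., None] * shifted + (1.0 - mask[..., None]) * fake
        # The union of all part masks records where the frame was altered.
        label = np.maximum(label, mask)
    return fake, label
```

Calling `self_blend(frame, regions)` on a real frame yields a training pair without any fake sample: the blended frame serves as a pseudo-fake input and `label` as the per-pixel localization target, while the untouched frame serves as the real example.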

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3ba/11785974/37a4ea029bd6/41598_2025_88523_Fig1_HTML.jpg
