Li Li, Zhang Kejia, Lu Jianfeng, Zhang Shanqing
School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, Zhejiang, China.
Shangyu Institute of Science and Engineering, Hangzhou Dianzi University, Shaoxing, Zhejiang, China.
PeerJ Comput Sci. 2025 Apr 8;11:e2775. doi: 10.7717/peerj-cs.2775. eCollection 2025.
Most deep learning methods for image forgery detection fail to detect and localize tampering operations accurately, and they support only a single tampering type. Our method introduces three key innovations: (1) a spatial perception module that combines the spatial rich model (SRM) with constrained convolution, focusing detection on tampering traces while suppressing interference from image content; (2) a hierarchical feature learning architecture that integrates Swin Transformer with UperNet for multi-scale tampering pattern recognition; and (3) an optimization strategy comprising auxiliary supervision, self-supervised learning, and hard example mining, which improves model convergence and detection accuracy. Experiments are performed on two established datasets, MixTamper and DocTamper, with 19,600 and 170,000 images, respectively. The results show that the proposed model improves the intersection over union (IoU) by 13% compared with the leading algorithms. It can also accurately detect multiple tampering types in a single image.
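The reported gain is measured in IoU, so a minimal sketch of how pixel-wise IoU between a predicted tampering mask and a ground-truth mask is commonly computed may help. The function name `mask_iou`, the nested-list mask format, and the convention for two empty masks are assumptions made for illustration; the abstract does not describe the authors' evaluation code, nor whether the 13% is an absolute or relative improvement.

```python
def mask_iou(pred, target):
    """Pixel-wise IoU between two binary masks (nested lists of 0/1).

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    same_rows = len(pred) == len(target)
    if not same_rows or any(len(p) != len(t) for p, t in zip(pred, target)):
        raise ValueError("masks must have the same shape")

    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p, t = bool(p), bool(t)
            intersection += p and t  # pixel marked tampered in both masks
            union += p or t          # pixel marked tampered in either mask

    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if union == 0 else intersection / union


# Two tampered pixels overlap and four are marked in total.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
print(mask_iou(pred, target))  # 0.5
```

For a multi-type setting such as the one described here, the same computation would typically be applied per tampering class and then averaged, although the abstract does not say which averaging the authors use.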