Han Ruidong, Wang Xiaofeng, Bai Ningning, Wang Yihang, Hou Jianpeng, Xue Jianru
IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):10005-10020. doi: 10.1109/TPAMI.2024.3432551. Epub 2024 Nov 6.
Modern image editing software lets anyone alter the content of an image to deceive the public, which threatens personal privacy and public safety. Detecting and localizing image tampering has therefore become an urgent problem. We show that, after manipulations such as splicing, copy-move, and removal, the tampered region exhibits homogenous differences (changes in the metadata organization form and organization structure of the image) relative to the authentic region. We therefore propose a novel end-to-end network, HDF-Net, that extracts these homogeny difference features to localize tampering artifacts precisely. HDF-Net is built on RGB and SRM dual-stream networks and comprises three complementary modules: the suspicious tampering-artifact prominent (STP) module, the fine tampering-artifact salient (FTS) module, and the tampering-artifact edge refined (TER) module. A fully attentional block (FLA) enhances the representational power of the homogeny difference features extracted by each module and preserves the details of the tampering artifacts. The modules are merged progressively following a "coarse-fine-finer" strategy, which significantly improves localization accuracy and edge refinement. Extensive experiments show that HDF-Net outperforms state-of-the-art tampering localization models on five benchmarks and achieves satisfactory generalization and robustness.
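To make the SRM stream mentioned in the abstract more concrete, the following is a minimal sketch of how SRM-style high-pass filtering turns an image into noise-residual maps. It is not the authors' implementation: the three 5x5 kernels are the ones commonly used in the tampering-localization literature, and whether HDF-Net uses exactly these kernels is an assumption. The names `SRM_KERNELS` and `srm_residuals`, the grayscale input, and the edge-replication border handling are all hypothetical choices made for illustration.

```python
# Three 5x5 SRM high-pass kernels with their normalisation constants.
# ASSUMPTION: these are the kernels commonly used in prior tampering-
# localization work; the abstract does not say which ones HDF-Net uses.
SRM_KERNELS = [
    ([[0, 0, 0, 0, 0],
      [0, -1, 2, -1, 0],
      [0, 2, -4, 2, 0],
      [0, -1, 2, -1, 0],
      [0, 0, 0, 0, 0]], 4.0),
    ([[-1, 2, -2, 2, -1],
      [2, -6, 8, -6, 2],
      [-2, 8, -12, 8, -2],
      [2, -6, 8, -6, 2],
      [-1, 2, -2, 2, -1]], 12.0),
    ([[0, 0, 0, 0, 0],
      [0, 0, 0, 0, 0],
      [0, 1, -2, 1, 0],
      [0, 0, 0, 0, 0],
      [0, 0, 0, 0, 0]], 2.0),
]


def srm_residuals(gray):
    """Return one noise-residual map per SRM kernel.

    `gray` is a 2-D grayscale image given as a list of equal-length rows.
    Borders are handled by replicating edge pixels (an illustrative choice),
    so every output map has the same size as the input.
    """
    h, w = len(gray), len(gray[0])
    outputs = []
    for kernel, scale in SRM_KERNELS:
        out = [[0.0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                acc = 0.0
                # Sliding-window correlation with the 5x5 kernel.
                for ky in range(5):
                    for kx in range(5):
                        yy = min(max(y + ky - 2, 0), h - 1)
                        xx = min(max(x + kx - 2, 0), w - 1)
                        acc += kernel[ky][kx] * gray[yy][xx]
                out[y][x] = acc / scale
        outputs.append(out)
    return outputs


# A flat image, and a copy with a small pasted patch of different intensity.
flat = [[100.0] * 8 for _ in range(8)]
spliced = [row[:] for row in flat]
for y in range(3, 5):
    for x in range(3, 5):
        spliced[y][x] = 140.0

flat_res = srm_residuals(flat)
spliced_res = srm_residuals(spliced)
```

Because every kernel sums to zero, smooth image content is suppressed and only local discontinuities survive, so the residual maps of `flat` are zero everywhere while those of `spliced` respond around the pasted patch. In a dual-stream network, maps of this kind would typically be fed to the noise branch alongside the RGB input; how HDF-Net combines the two streams is described only at the level of the abstract here.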