Multi-resolution transfer learning for tampered image classification using SE-enhanced fused-MBConv and optimized CNN heads.

Author information

Korsipati Jithin Reddy, Yanamala Rama Muni Reddy, Pallakonda Archana, Raj Rayappa David Amar, Prakasha K Krishna

Affiliations

Amrita School of Artificial Intelligence, Amrita Vishwa Vidyapeetham, Coimbatore, Tamil Nadu, 641112, India.

Department of Electronics and Communication Engineering, Indian Institute of Information Technology Design and Manufacturing (IIITD&M) Kancheepuram, Chennai, 600127, India.

Publication information

Sci Rep. 2025 Sep 24;15(1):32717. doi: 10.1038/s41598-025-17799-0.

DOI:10.1038/s41598-025-17799-0

PMID:40993163

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12460640/

Abstract

The widespread use of digital image tampering has created a strong need for accurate and generalizable detection systems, especially in domains like forensics, journalism, and cybersecurity. Traditional handcrafted methods often fail to capture subtle manipulation artifacts, and many deep learning approaches generalize poorly across diverse image sources and manipulation techniques. To address these limitations, we propose a tampered image classification model based on transfer learning using EfficientNetV2B0. This backbone is combined with a lightweight, regularized CNN classification head and optimized using Focal Loss to address class imbalance. The architecture integrates compound scaling, fused MBConv layers, and squeeze-and-excitation (SE) attention to improve feature representation and robustness. We evaluate the model on four benchmark datasets: CASIA v1, Columbia, MICC-F2000, and Defacto (Splicing). It achieves AUC scores up to 1.0000 and F1-scores up to 0.9997. Comparisons with 42 state-of-the-art models, including IML-ViT, MVSS-Net++, ConvNeXtFF, and DRRU-Net, show that our method consistently outperforms existing approaches in accuracy, precision, recall, and generalization, particularly on high-resolution and compressed images. These results demonstrate the practical effectiveness and forensic reliability of the proposed system.
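
The abstract names Focal Loss as the remedy for class imbalance but does not give its form or hyperparameters. The following is a minimal pure-Python sketch of the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), included only to illustrate how it down-weights easy examples. The function name `binary_focal_loss` and the defaults `gamma=2.0` and `alpha=0.25` are assumptions taken from the common formulation of the loss, not values reported by the authors.

```python
import math


def binary_focal_loss(y_true, p_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss.

    y_true: sequence of 0/1 labels (1 = tampered, by assumption).
    p_pred: sequence of predicted probabilities for class 1.
    gamma, alpha: illustrative defaults, not the paper's settings.
    """
    if len(y_true) != len(p_pred):
        raise ValueError("y_true and p_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")

    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Clip to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability assigned to the true class.
        p_t = p if y == 1 else 1.0 - p
        # alpha_t weights the positive class by alpha, the negative by 1 - alpha.
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t)^gamma shrinks the loss of confidently correct predictions.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(y_true)


if __name__ == "__main__":
    labels = [1, 1, 0, 0]
    probs = [0.9, 0.3, 0.2, 0.6]
    print("focal loss:", binary_focal_loss(labels, probs))
    # With gamma=0 the modulating factor vanishes and the loss reduces
    # to alpha-weighted cross-entropy.
    print("gamma=0   :", binary_focal_loss(labels, probs, gamma=0.0))
```

Setting `gamma=0` recovers alpha-weighted cross-entropy, so the difference between the two printed values shows how much the modulating factor suppresses well-classified samples.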
