
An enhanced denoising system for mammogram images using deep transformer model with fusion of local and global features.

Authors

Singh A Robert, Athisayamani Suganya, Karim Faten Khalid, Ibrahim Ahmed Zohair, Alshetewi Sameer, Mostafa Samih M

Affiliations

Department of Computational Intelligence, SRM Institute of Science and Technology, Chennai, Tamil Nadu, 603203, India.

School of Computing, Sastra Deemed to be University, Thanjavur, Tamil Nadu, 613401, India.

Publication

Sci Rep. 2025 Feb 24;15(1):6562. doi: 10.1038/s41598-025-89451-w.

Abstract

Image denoising is a critical problem in low-level computer vision, where the aim is to reconstruct a clean, noise-free image from a noisy input, such as a mammogram image. In recent years, deep learning, particularly convolutional neural networks (CNNs), has shown great success in various image processing tasks, including denoising, image compression, and enhancement. While CNN-based approaches dominate, Transformer models have recently gained popularity for computer vision tasks. However, there have been fewer applications of Transformer-based models to low-level vision problems like image denoising. In this study, a novel denoising network architecture called DeepTFormer is proposed, which leverages Transformer models for the task. The DeepTFormer architecture consists of three main components: a preprocessing module, a local-global feature extraction module, and a reconstruction module. The local-global feature extraction module is the core of DeepTFormer, comprising several groups of ITransformer layers. Each group includes a series of Transformer layers, convolutional layers, and residual connections. These groups are tightly coupled with residual connections, which allow the model to capture both local and global information from the noisy images effectively. The design of these groups ensures that the model can utilize both local features for fine details and global features for larger context, leading to more accurate denoising. To validate the performance of the DeepTFormer model, extensive experiments were conducted using both synthetic and real noise data. Objective and subjective evaluations demonstrated that DeepTFormer outperforms leading denoising methods. The model achieved impressive results, surpassing state-of-the-art techniques in terms of key metrics like PSNR, FSIM, EPI, and SSIM, with values of 0.41, 0.93, 0.96, and 0.94, respectively. These results demonstrate that DeepTFormer is a highly effective solution for image denoising, combining the power of Transformer architecture with convolutional layers to enhance both local and global feature extraction.
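The local-global fusion with residual connections described in the abstract can be sketched in miniature. The snippet below is an illustrative toy, not the paper's implementation: the tokenization (one token per image row), the weight shapes, the smoothing kernel, and the simple additive fusion are all assumptions made for demonstration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, Wq, Wk, Wv):
    # Global branch: single-head self-attention over all tokens,
    # so every token can attend to every other (global context).
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return scores @ v

def conv3x3(img, kernel):
    # Local branch: 'same' 3x3 convolution with zero padding,
    # capturing fine local detail in a small neighborhood.
    h, w = img.shape
    padded = np.pad(img, 1)
    out = np.zeros_like(img)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)
    return out

def local_global_group(img, Wq, Wk, Wv, kernel):
    # Treat each image row as one token (illustrative tokenization).
    global_feat = self_attention(img, Wq, Wk, Wv)
    local_feat = conv3x3(img, kernel)
    # Residual connection fuses the input with both feature branches.
    return img + global_feat + local_feat

rng = np.random.default_rng(0)
d = 8
img = rng.standard_normal((d, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
kernel = np.full((3, 3), 1 / 9.0)  # simple averaging kernel
out = local_global_group(img, Wq, Wk, Wv, kernel)
print(out.shape)  # (8, 8)
```

A full model would stack several such groups (the abstract's ITransformer groups), again chained with residual connections, before a reconstruction module maps features back to a clean image.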


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ede1/11850631/a8baeea114a9/41598_2025_89451_Fig1_HTML.jpg
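Of the metrics reported above, PSNR is the most directly computable; the following sketch shows the standard definition (10·log10(MAX²/MSE)) on a toy example, and is not tied to the paper's evaluation pipeline.

```python
import numpy as np

def psnr(clean, denoised, max_val=1.0):
    # Peak signal-to-noise ratio in dB between a reference image
    # and its denoised estimate.
    mse = np.mean((clean - denoised) ** 2)
    if mse == 0:
        return float("inf")
    return 10 * np.log10(max_val ** 2 / mse)

clean = np.ones((4, 4))
noisy = clean + 0.1  # uniform +0.1 error -> MSE = 0.01
print(round(psnr(clean, noisy), 2))  # 20.0
```

SSIM, FSIM, and EPI involve windowed statistics or gradient/phase information and are typically taken from an image-processing library rather than re-implemented.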
