A new visual State Space Model for low-dose CT denoising.

Author Information

Huang Jiexing, Zhong Anni, Wei Yajing

Affiliations

Department of Radiation Oncology, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China.

Department of Digital Hospital Construction, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China.

Publication Information

Med Phys. 2024 Dec;51(12):8851-8864. doi: 10.1002/mp.17387. Epub 2024 Sep 4.

Abstract

BACKGROUND

Low-dose computed tomography (LDCT) can mitigate potential health risks to the public. However, the severe noise and artifacts in LDCT images can impede subsequent clinical diagnosis and analysis. Convolutional neural networks (CNNs) and Transformers stand out as the two most popular backbones in LDCT denoising. Nonetheless, CNNs suffer from a lack of long-range modeling capabilities, while Transformers are hindered by high computational complexity.

PURPOSE

In this study, our main goal is to develop a simple and efficient LDCT denoising model that both attends to local spatial context and models long-range dependencies with linear computational complexity.

METHODS

In this study, we make the first attempt to apply the State Space Model to LDCT denoising and propose a novel LDCT denoising model named the Visual Mamba Encoder-Decoder Network (ViMEDnet). To capture both local and global features efficiently and effectively, we propose the Mixed State Space Module (MSSM), in which depth-wise convolution, max-pooling, and a 2D Selective Scan Module (2DSSM) are coupled through a partial channel splitting mechanism. The 2DSSM captures global information with linear computational complexity, while the convolution and max-pooling learn local signals to facilitate detail restoration. Furthermore, the network uses a weighted gradient-sensitive hybrid loss function to encourage the preservation of image details, improving the overall denoising performance.
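The abstract does not give the internals of the MSSM, so the following is only a minimal PyTorch sketch of the partial channel-splitting idea it describes: part of the channels pass through a local branch (depth-wise convolution and max-pooling) and the rest through a global branch, which in ViMEDnet is the 2DSSM. Because the selective-scan computation is not specified here, `GlobalBranchStub` below is a hypothetical placeholder that only mimics global mixing; the class names, split ratio, and residual connection are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class GlobalBranchStub(nn.Module):
    """Hypothetical stand-in for the 2DSSM: adds a projected global-average
    context back to every spatial location (NOT the actual selective scan)."""

    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        context = x.mean(dim=(2, 3), keepdim=True)  # global context, linear in pixel count
        return x + self.proj(context)               # broadcast global signal back

class MixedStateSpaceModule(nn.Module):
    """Partial channel split: a fraction of the channels goes through the local
    branch (depth-wise conv + max-pooling), the rest through the global branch;
    the two outputs are concatenated and fused."""

    def __init__(self, channels: int, split_ratio: float = 0.5):
        super().__init__()
        self.c_local = int(channels * split_ratio)
        c_global = channels - self.c_local
        self.local_branch = nn.Sequential(
            # depth-wise convolution: learns local spatial detail per channel
            nn.Conv2d(self.c_local, self.c_local, kernel_size=3,
                      padding=1, groups=self.c_local),
            # stride-1 max-pooling keeps resolution while emphasizing salient
            # local responses
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
        )
        self.global_branch = GlobalBranchStub(c_global)
        self.fuse = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x_local, x_global = torch.split(
            x, [self.c_local, x.shape[1] - self.c_local], dim=1)
        out = torch.cat([self.local_branch(x_local),
                         self.global_branch(x_global)], dim=1)
        return self.fuse(out) + x  # residual connection (assumed)


if __name__ == "__main__":
    block = MixedStateSpaceModule(channels=64)
    y = block(torch.randn(1, 64, 128, 128))  # e.g., one 128x128 CT feature map
    print(y.shape)                            # torch.Size([1, 64, 128, 128])
```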

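Likewise, the weighted gradient-sensitive hybrid loss is only named, not defined, in the abstract. The sketch below shows one common formulation consistent with that description, purely as an assumption: a pixel-wise fidelity term plus a Sobel-gradient term under a hypothetical weight `lam`. The actual terms and weighting scheme used in ViMEDnet may differ.

```python
import torch
import torch.nn.functional as F


def sobel_gradients(img: torch.Tensor) -> torch.Tensor:
    """Horizontal and vertical Sobel responses, stacked along the channel dim."""
    kx = torch.tensor([[-1.0, 0.0, 1.0],
                       [-2.0, 0.0, 2.0],
                       [-1.0, 0.0, 1.0]],
                      dtype=img.dtype, device=img.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    c = img.shape[1]
    gx = F.conv2d(img, kx.repeat(c, 1, 1, 1), padding=1, groups=c)
    gy = F.conv2d(img, ky.repeat(c, 1, 1, 1), padding=1, groups=c)
    return torch.cat([gx, gy], dim=1)


def hybrid_loss(pred: torch.Tensor, target: torch.Tensor,
                lam: float = 0.5) -> torch.Tensor:
    """Assumed form: pixel-wise fidelity plus a weighted gradient (edge) term."""
    pixel_term = F.mse_loss(pred, target)
    grad_term = F.l1_loss(sobel_gradients(pred), sobel_gradients(target))
    return pixel_term + lam * grad_term


if __name__ == "__main__":
    denoised = torch.rand(2, 1, 64, 64)  # hypothetical network output
    ndct = torch.rand(2, 1, 64, 64)      # corresponding normal-dose reference
    print(hybrid_loss(denoised, ndct).item())
```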
RESULTS

The performance of the proposed ViMEDnet is compared with five state-of-the-art LDCT denoising methods, including an iterative algorithm, two CNN-based methods, and two Transformer-based methods. The comparative experiments demonstrate that ViMEDnet achieves better visual quality and quantitative assessment outcomes. In visual evaluation, ViMEDnet effectively removes noise and artifacts while showing superior performance in restoring fine structures and low-contrast structural edges, so that the denoised images deviate minimally from the normal-dose CT (NDCT) references. In quantitative assessment, ViMEDnet obtains the lowest root-mean-square error (RMSE) and the highest peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and feature similarity (FSIM) scores, further substantiating its superiority.

CONCLUSIONS

The proposed ViMEDnet delivers excellent LDCT denoising performance and offers a new alternative to existing CNN- and Transformer-based LDCT denoising models.
