Huang Jiexing, Zhong Anni, Liu Yujian
Department of Radiation Oncology, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China.
Department of Information Technology, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China.
Quant Imaging Med Surg. 2024 Dec 5;14(12):9290-9305. doi: 10.21037/qims-24-1145. Epub 2024 Nov 29.
Low-dose computed tomography (LDCT) reduces radiation exposure, but the noise and artifacts it introduces impair diagnostic accuracy. Convolutional neural networks (CNNs) are widely used for LDCT denoising, but they suffer from a limited receptive field. A larger kernel size enlarges the receptive field and can boost model performance; however, it greatly increases the computational cost of the model. We aimed to develop an LDCT denoising CNN with a large receptive field and lower computational complexity.
We developed a multi-scale perceptual modulation network (MSPMnet) built around a novel multi-head decomposable convolution (MHDC) module. The MHDC module addresses the high computational complexity of large-kernel convolutions by capturing multi-scale features and expanding the receptive field efficiently. It couples max-pooling with three depth-wise convolutions of different kernel sizes via a channel-splitting mechanism; unlike in conventional CNNs, each of the two large two-dimensional kernels is decomposed into a set of cascaded orthogonal one-dimensional kernels so that the module remains lightweight. Further, whereas prior methods apply a uniform kernel size throughout the network, we introduced a receptive field-ramp mechanism that transitions from local to relatively long-range dependency modeling as the network depth increases, thereby improving performance.
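To make the cost argument concrete, the following is a minimal sketch, not the authors' implementation. It assumes four equal channel groups (one max-pooling head and three depth-wise heads), ignores bias terms, and treats each large k×k kernel as a cascaded 1×k and k×1 pair. The kernel sizes (3, 7, 11) and the per-layer schedules are hypothetical values chosen for illustration; the abstract does not state them.

```python
def split_channels(channels, heads=4):
    """Split channels into `heads` near-equal groups (assumed splitting rule)."""
    base, extra = divmod(channels, heads)
    return [base + (1 if i < extra else 0) for i in range(heads)]


def depthwise_2d_params(channels, k):
    """Weights in a depth-wise k x k convolution (one k x k kernel per channel)."""
    return channels * k * k


def decomposed_1d_params(channels, k):
    """Weights when the k x k kernel is replaced by cascaded 1 x k and k x 1 kernels."""
    return channels * 2 * k


def mhdc_params(channels, small_k, large_ks, decompose=True):
    """Weight count of one hypothetical MHDC-style block.

    Head 0 is max-pooling (no weights), head 1 is a small 2D depth-wise
    kernel, and heads 2-3 are the two large kernels.
    """
    groups = split_channels(channels, 4)
    total = depthwise_2d_params(groups[1], small_k)
    for group, k in zip(groups[2:], large_ks):
        if decompose:
            total += decomposed_1d_params(group, k)
        else:
            total += depthwise_2d_params(group, k)
    return total


def receptive_field(kernel_sizes):
    """Receptive field of stacked stride-1, dilation-1 convolutions."""
    return 1 + sum(k - 1 for k in kernel_sizes)


if __name__ == "__main__":
    # Hypothetical sizes: 64 channels, small kernel 3, large kernels 7 and 11.
    print(mhdc_params(64, 3, (7, 11), decompose=True))
    print(mhdc_params(64, 3, (7, 11), decompose=False))
    # Uniform kernel size versus a ramp that grows with depth (both hypothetical).
    print(receptive_field([3, 3, 3, 3]), receptive_field([3, 5, 7, 9]))
```

Under these assumptions the decomposed kernels scale as 2k rather than k² per channel, which is why the large heads stay cheap, and a ramped schedule reaches a wider receptive field at the same depth than a uniform small kernel.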
The proposed MSPMnet was evaluated on a Mayo Clinic data set with a conventional iterative algorithm, two CNN models, and two Transformer models used for comparison. Compared to the competing baseline methods, the MSPMnet exhibited better performance in both the visual and quantitative assessments. Visually, the MSPMnet preserved the structure, edges, and textures with excellent noise and artifact reduction, generating the denoised images closest to normal-dose computed tomography images. Quantitatively, the MSPMnet had the lowest root mean-square error (RMSE) (8.3094±1.9325) and the highest peak signal-to-noise ratio (PSNR) (33.8525±1.8213 dB), structural similarity index (SSIM) (0.9309±0.0272), and feature similarity index (FSIM) (0.9699±0.0113), demonstrating superior denoising performance.
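For readers unfamiliar with the two error-based metrics, the sketch below shows the standard definitions of RMSE and PSNR on flattened pixel sequences. It is an illustration only: the function names are hypothetical, and the `data_range` value (the span of valid intensities) is an assumption, since the abstract does not state the intensity window used in the evaluation. SSIM and FSIM are omitted because they require windowed statistics beyond a short example.

```python
import math


def rmse(reference, estimate):
    """Root mean-square error between two equal-length pixel sequences."""
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    return math.sqrt(mse)


def psnr(reference, estimate, data_range):
    """Peak signal-to-noise ratio in dB: 20 * log10(data_range / RMSE)."""
    err = rmse(reference, estimate)
    if err == 0:
        return math.inf  # identical images
    return 20 * math.log10(data_range / err)


if __name__ == "__main__":
    # Toy values, not CT data: a normal-dose reference and a denoised estimate.
    ndct = [100.0, 120.0, 90.0, 110.0]
    denoised = [102.0, 118.0, 91.0, 109.0]
    print(rmse(ndct, denoised), psnr(ndct, denoised, data_range=255.0))
```

Lower RMSE and higher PSNR both indicate a denoised image closer to the normal-dose reference, which is the direction of the results reported above.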
The proposed MSPMnet excelled at LDCT denoising, effectively removing noise and artifacts while preserving edges. Compared to the state-of-the-art CNNs and Transformers, the proposed MSPMnet exhibited superior denoising performance both quantitatively and qualitatively.