IEEE J Biomed Health Inform. 2024 Nov;28(11):6815-6827. doi: 10.1109/JBHI.2024.3454979. Epub 2024 Nov 6.
Low-dose computed tomography (LDCT) offers reduced X-ray radiation exposure but at the cost of compromised image quality, characterized by increased noise and artifacts. Recently, transformer models have emerged as a promising avenue to enhance LDCT image quality. However, the success of such models relies on a large amount of paired noisy and clean images, which are often scarce in clinical settings. In computer vision and natural language processing, masked autoencoders (MAE) have been recognized as a powerful self-pretraining method for transformers, due to their exceptional capability to extract representative features. However, the original pretraining and fine-tuning design fails to work in low-level vision tasks like denoising. In response to this challenge, we redesign the classical encoder-decoder learning model and propose a simple yet effective, streamlined low-level vision MAE, referred to as LoMAE, tailored to address the LDCT denoising problem. Moreover, we introduce an MAE-GradCAM method to shed light on the latent learning mechanisms of MAE/LoMAE. Additionally, we explore LoMAE's robustness and generalizability across a variety of noise levels. Experimental findings show that the proposed LoMAE enhances the denoising capabilities of the transformer and substantially reduces its dependency on high-quality, ground-truth data. It also demonstrates remarkable robustness and generalizability over a spectrum of noise levels. In summary, the proposed LoMAE provides promising solutions to the major issues in LDCT, including interpretability, ground-truth data dependency, and model robustness/generalizability.
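For readers unfamiliar with MAE-style self-pretraining, the masking step it relies on can be illustrated with a minimal sketch. This is generic random patch masking, not LoMAE's actual procedure, which the abstract does not specify. The function name `mask_patches`, the toy 8x8 image, the patch size, and the 0.75 mask ratio are all assumptions chosen for illustration.

```python
import random


def mask_patches(image, patch, mask_ratio, seed=0):
    """Zero out a random subset of non-overlapping patch x patch blocks.

    `image` is a list of equal-length rows (hypothetical toy input).
    Returns (masked_image, masked_ids); masked_ids are row-major patch indices.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    rows, cols = height // patch, width // patch
    n_masked = int(rows * cols * mask_ratio)
    rng = random.Random(seed)
    masked_ids = sorted(rng.sample(range(rows * cols), n_masked))
    masked = [list(row) for row in image]  # copy so the input is untouched
    for pid in masked_ids:
        top, left = (pid // cols) * patch, (pid % cols) * patch
        for y in range(top, top + patch):
            for x in range(left, left + patch):
                masked[y][x] = 0.0
    return masked, masked_ids


# Toy 8x8 stand-in for a noisy slice; values start at 1 so no pixel is zero.
noisy = [[float(y * 8 + x + 1) for x in range(8)] for y in range(8)]
masked, ids = mask_patches(noisy, patch=2, mask_ratio=0.75)
```

In a self-pretraining setup of this kind, a network would then be trained to reconstruct the hidden patches from the visible ones, which requires no paired clean image.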