Laboratory of Functional MRI Technology (LOFT), Stevens Neuroimaging and Informatics Institute, University of Southern California, Los Angeles, California, USA.
Laboratory of Neuro Imaging (LONI), Stevens Neuroimaging and Informatics Institute, University of Southern California, Los Angeles, California, USA.
Magn Reson Med. 2024 Feb;91(2):803-818. doi: 10.1002/mrm.29887. Epub 2023 Oct 17.
To present a Swin Transformer-based deep learning (DL) model (SwinIR) for denoising single-delay and multi-delay 3D arterial spin labeling (ASL) and compare its performance with convolutional neural network (CNN) and other Transformer-based methods.
SwinIR and CNN-based spatial denoising models were developed for single-delay ASL. The models were trained on 66 subjects (119 scans) and tested on 39 subjects (44 scans) from three different vendors. Spatiotemporal denoising models were developed using another dataset (6 subjects, 10 scans) of multi-delay ASL. A range of input conditions was tested for single-delay and multi-delay ASL denoising. Performance was evaluated using similarity metrics, spatial SNR, and the quantification accuracy of cerebral blood flow (CBF) and arterial transit time (ATT).
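For context, the sketch below (Python) illustrates how denoised single-delay ASL could be evaluated along the lines described: CBF is quantified with the standard single-PLD PCASL consensus (white-paper) formula and compared against a high-SNR reference with a similarity metric such as PSNR. The function names and parameter defaults (labeling duration, post-labeling delay, blood T1, labeling efficiency) are illustrative assumptions, not necessarily those used in the study.

```python
import numpy as np

def quantify_cbf(delta_m, m0, pld=1.8, tau=1.8,
                 t1_blood=1.65, alpha=0.85, lam=0.9):
    """Single-PLD PCASL CBF quantification (consensus white-paper formula).

    delta_m : perfusion-weighted difference image (control - label)
    m0      : proton-density (M0) image
    Returns CBF in mL/100 g/min. Defaults are typical 3T values and are
    assumptions for illustration, not the paper's exact protocol.
    """
    num = 6000.0 * lam * delta_m * np.exp(pld / t1_blood)
    den = 2.0 * alpha * t1_blood * m0 * (1.0 - np.exp(-tau / t1_blood))
    return num / np.maximum(den, 1e-8)  # guard against division by zero

def psnr(reference, denoised):
    """Peak SNR between a high-SNR reference and a denoised image."""
    mse = np.mean((reference - denoised) ** 2)
    peak = reference.max()
    return 10.0 * np.log10(peak ** 2 / max(mse, 1e-12))
```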
SwinIR outperformed CNN and other Transformer-based networks, and pseudo-3D models performed better than 2D models for denoising single-delay ASL. The similarity metrics and image quality (SNR) improved with more slices in the pseudo-3D models and improved further when M0 was included as an input, although this introduced greater biases in CBF quantification. Pseudo-3D models with three slices achieved an optimal balance between SNR and accuracy, which generalized across vendors. For multi-delay ASL, spatiotemporal denoising models performed better than spatial-only models, with reduced biases in the fitted CBF and ATT maps.
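The pseudo-3D input scheme referenced above can be sketched as follows: for each target slice, a small number of neighboring slices (e.g., three) are stacked as input channels, optionally with the M0 image appended as an additional channel. The shapes, channel ordering, and function name below are illustrative assumptions rather than the study's exact preprocessing.

```python
import numpy as np

def pseudo3d_stack(perf_vol, m0_vol=None, n_slices=3):
    """Build pseudo-3D network inputs from a 3D perfusion volume.

    For each target slice, n_slices neighboring slices (edge-replicated at
    the volume boundaries) are stacked as input channels; the corresponding
    M0 slice can be appended as an extra channel.

    perf_vol : (n_z, H, W) perfusion-weighted volume
    m0_vol   : optional (n_z, H, W) M0 volume
    Returns  : (n_z, C, H, W) array, C = n_slices (+1 with M0)
    """
    half = n_slices // 2
    n_z = perf_vol.shape[0]
    inputs = []
    for z in range(n_z):
        # clip neighbor indices so boundary slices are replicated
        idx = np.clip(np.arange(z - half, z + half + 1), 0, n_z - 1)
        channels = perf_vol[idx]                       # (n_slices, H, W)
        if m0_vol is not None:
            channels = np.concatenate([channels, m0_vol[z:z + 1]], axis=0)
        inputs.append(channels)
    return np.stack(inputs, axis=0)
```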
SwinIR provided better performance than CNN and other Transformer-based methods for denoising both single-delay and multi-delay 3D ASL data. The proposed model offers the flexibility to improve image quality and/or reduce scan time for 3D ASL, facilitating its clinical use.