Tahsin Rahman, Ali Bilgin, Sergio D Cabrera
Department of Electrical and Computer Engineering, The University of Texas at El Paso, El Paso, TX 79968, United States of America.
Departments of Biomedical Engineering, Electrical and Computer Engineering, and Medical Imaging, University of Arizona, Tucson, AZ 85721, United States of America.
Phys Med Biol. 2025 Mar 19;70(7). doi: 10.1088/1361-6560/adb933.
Objective: Deep neural networks have been shown to be very effective at artifact-reduction tasks such as magnetic resonance imaging (MRI) reconstruction from undersampled k-space data. In recent years, attention-based vision transformer models have been shown to outperform purely convolutional models on a wide variety of tasks, including MRI reconstruction. Our objective is to investigate the use of different transformer architectures for multi-channel cascaded MRI reconstruction.

Approach: In this work, we explore the effective use of cascades of small transformers for multi-channel undersampled MRI reconstruction. We introduce overlapped attention and compare it to hybrid attention in shifted-window (Swin) transformers. We also investigate the impact of the number of Swin transformer layers in each architecture. The proposed methods are compared to state-of-the-art MRI reconstruction methods for undersampled reconstruction of standard 3T and low-field (0.3T) T1-weighted MRI images at multiple acceleration rates.

Main results: The models with overlapped attention achieve significantly higher or equivalent quantitative test metrics compared to state-of-the-art convolutional approaches. They also show more consistent reconstruction performance across acceleration rates than their hybrid-attention counterparts. We also show that transformer architectures with fewer layers can be as effective as those with more layers when used in cascaded MRI reconstruction problems.

Significance: The feasibility and effectiveness of cascades of small transformers with overlapped attention for MRI reconstruction are demonstrated without pre-training the transformer on ImageNet or other large-scale datasets.
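As a rough illustration of the two ideas summarized above, the sketch below shows overlapped window attention (each query window attending to an enlarged key/value window around it) inside a cascade of small transformer stages interleaved with k-space data-consistency steps. This is a minimal sketch in PyTorch, not the authors' implementation: the single-coil simplification, the window/overlap sizes, and all class and parameter names are our illustrative assumptions.

```python
# Minimal sketch (NOT the paper's code): overlapped window attention in a
# cascade of small transformer stages with k-space data consistency.
import torch
import torch.nn as nn
import torch.nn.functional as F

class OverlappedWindowAttention(nn.Module):
    """Each non-overlapping query window attends to an enlarged,
    overlapping key/value window around it (a rough stand-in for the
    overlapped attention described in the abstract)."""
    def __init__(self, dim=32, window=8, overlap=4, heads=4):
        super().__init__()
        self.w, self.o = window, overlap
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, x):                        # x: (B, C, H, W)
        B, C, H, W = x.shape
        w, o = self.w, self.o
        we = w + 2 * o                           # enlarged k/v window
        # Queries: non-overlapping w x w windows, row-major order.
        q = x.view(B, C, H // w, w, W // w, w)
        q = q.permute(0, 2, 4, 3, 5, 1).reshape(-1, w * w, C)
        # Keys/values: overlapping we x we windows centered on each
        # query window, extracted by unfold in the same window order.
        kv = F.unfold(x, kernel_size=we, stride=w, padding=o)
        kv = kv.view(B, C, we * we, -1).permute(0, 3, 2, 1)
        kv = kv.reshape(-1, we * we, C)
        qn, kvn = self.norm1(q), self.norm1(kv)
        t = q + self.attn(qn, kvn, kvn, need_weights=False)[0]
        t = t + self.mlp(self.norm2(t))
        # Reverse the query-window partition back to image layout.
        t = t.view(B, H // w, W // w, w, w, C)
        return t.permute(0, 5, 1, 3, 2, 4).reshape(B, C, H, W)

def data_consistency(img, k_meas, mask):
    """Overwrite the estimate's k-space with the acquired samples."""
    k = torch.fft.fft2(img)
    k = torch.where(mask, k_meas, k)
    return torch.fft.ifft2(k).real

class CascadedReconstructor(nn.Module):
    """Cascade of small transformer stages, each followed by a
    data-consistency step, as in unrolled MRI reconstruction."""
    def __init__(self, cascades=3, dim=32):
        super().__init__()
        self.embed = nn.ModuleList(nn.Conv2d(1, dim, 3, padding=1)
                                   for _ in range(cascades))
        self.blocks = nn.ModuleList(OverlappedWindowAttention(dim)
                                    for _ in range(cascades))
        self.proj = nn.ModuleList(nn.Conv2d(dim, 1, 3, padding=1)
                                  for _ in range(cascades))

    def forward(self, img, k_meas, mask):
        for emb, blk, prj in zip(self.embed, self.blocks, self.proj):
            img = img + prj(blk(emb(img)))       # residual refinement
            img = data_consistency(img, k_meas, mask)
        return img

# Toy usage: 64x64 single-coil image with a random undersampling mask.
img0 = torch.randn(1, 1, 64, 64)
mask = torch.rand(1, 1, 64, 64) < 0.3
k_meas = torch.fft.fft2(img0) * mask
zero_filled = torch.fft.ifft2(k_meas).real
recon = CascadedReconstructor()(zero_filled, k_meas, mask)
```

The overlap lets information flow across window boundaries without shifted windows; the paper's actual architectures, layer counts, training setup, and multi-channel (multi-coil) handling differ from this single-coil toy.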