Jiang Yun, Zhang Yuan, Lin Xin, Dong Jinkun, Cheng Tongtong, Liang Jing
College of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China.
Brain Sci. 2022 Jun 17;12(6):797. doi: 10.3390/brainsci12060797.
Brain tumor semantic segmentation is a critical medical image processing task that helps clinicians diagnose patients and determine the extent of lesions. In recent years, convolutional neural networks (CNNs) have demonstrated exceptional performance in computer vision tasks. For 3D medical image tasks, deep CNNs based on an encoder-decoder structure with skip connections are frequently used. However, CNNs struggle to learn global and long-range semantic information. The transformer, by contrast, has recently succeeded in natural language processing and computer vision because its self-attention mechanism models global information. Demanding prediction tasks such as 3D medical image segmentation depend on both local and global features. We propose SwinBTS, a new 3D medical image segmentation approach that combines a transformer, a convolutional neural network, and an encoder-decoder structure, and that formulates 3D brain tumor semantic segmentation as a sequence-to-sequence prediction problem. To extract contextual information, a 3D Swin Transformer serves as the network's encoder and decoder, while convolutional operations handle upsampling and downsampling. Finally, segmentation results are produced by an enhanced Transformer module that we designed to improve the extraction of detail features. Extensive experiments on the BraTS 2019, BraTS 2020, and BraTS 2021 datasets show that SwinBTS outperforms state-of-the-art 3D algorithms for brain tumor segmentation on 3D MRI scans.
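To illustrate why a Swin-style transformer is a practical choice for 3D volumes, the following minimal sketch compares the number of query-key pairs under global self-attention with the number under attention restricted to non-overlapping 3D windows, which is the general idea behind Swin Transformers. The abstract does not state token grid or window sizes, so the values below (a 32 x 32 x 32 token grid and 8 x 8 x 8 windows) and the function names are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from math import prod


def global_attention_pairs(grid):
    """Query-key pairs when every token attends to every other token."""
    n = prod(grid)
    return n * n


def windowed_attention_pairs(grid, window):
    """Query-key pairs when attention is limited to non-overlapping 3D windows."""
    if any(g % w for g, w in zip(grid, window)):
        raise ValueError("grid must be divisible by window along every axis")
    num_windows = prod(g // w for g, w in zip(grid, window))
    tokens_per_window = prod(window)
    return num_windows * tokens_per_window ** 2


# Hypothetical sizes, not taken from the paper.
grid = (32, 32, 32)    # token grid after patch embedding (assumed)
window = (8, 8, 8)     # local attention window (assumed)

full = global_attention_pairs(grid)
local = windowed_attention_pairs(grid, window)
print(f"global:   {full:,} pairs")
print(f"windowed: {local:,} pairs")
print(f"reduction factor: {full // local}")
```

Under these assumed sizes, windowed attention evaluates 64 times fewer query-key pairs than global attention. Restricting attention to windows limits each layer's receptive field, which Swin-style designs compensate for by shifting the windows between layers and by operating on progressively downsampled feature maps.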