Department of Nuclear Medicine, Henri Becquerel Center, 76038, Rouen, France.
LITIS-QuantIF Laboratory, University of Rouen Normandy, 76183, Rouen, France.
Int J Comput Assist Radiol Surg. 2024 Feb;19(2):273-281. doi: 10.1007/s11548-023-03024-8. Epub 2023 Oct 5.
Fully convolutional neural network architectures have proven useful for brain tumor segmentation tasks. However, their ability to learn long-range dependencies is limited by their localized receptive fields. In contrast, vision transformers (ViTs), built on a multi-head self-attention mechanism that generates attention maps to aggregate spatial information dynamically, have outperformed convolutional neural networks (CNNs). Inspired by the recent success of ViT models in medical image segmentation, we propose in this paper a new Swin-transformer-based network for semantic brain tumor segmentation.
The proposed method for brain tumor segmentation combines Transformer and CNN modules in an encoder-decoder structure. The encoder incorporates enhanced local self-attention (ELSA) transformer blocks to strengthen the extraction of fine local features. The extracted feature representations are fed to the decoder via skip connections. The encoder part also includes channel squeeze and spatial excitation blocks, which make the extracted features more informative both spatially and channel-wise.
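The channel squeeze and spatial excitation recalibration mentioned above can be illustrated with a minimal numpy sketch. This is an assumption-laden illustration of the standard sSE/cSE/scSE formulation (Roy et al.), not the authors' implementation; all function names and weight shapes here are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_squeeze_spatial_excitation(feat, w):
    """sSE: squeeze channels with a 1x1 conv (weights w of shape (C,)),
    then gate every spatial location by a sigmoid attention map."""
    # feat has shape (C, H, W); attn has shape (H, W)
    attn = sigmoid(np.tensordot(w, feat, axes=([0], [0])))
    return feat * attn[None, :, :]

def spatial_squeeze_channel_excitation(feat, w1, w2):
    """cSE: global average pool, two fully connected layers (w1, w2),
    then sigmoid channel gates."""
    z = feat.mean(axis=(1, 2))                  # (C,) squeeze
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))   # (C,) excitation
    return feat * s[:, None, None]

def scse(feat, w_s, w1, w2):
    """Concurrent scSE: element-wise max of the two recalibrated maps."""
    return np.maximum(channel_squeeze_spatial_excitation(feat, w_s),
                      spatial_squeeze_channel_excitation(feat, w1, w2))
```

Because both gates lie in (0, 1), each recalibration attenuates, rather than amplifies, the input activations while preserving the feature-map shape.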
The method is evaluated on the public BraTS 2021 dataset, which contains 1251 brain image cases, each with four 3D MRI modalities. Our proposed approach achieved strong segmentation results, with an average Dice score of 89.77% and an average Hausdorff distance of 8.90 mm.
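The two metrics reported above can be computed as in the generic sketch below. This is a plain illustration of the Dice coefficient and the symmetric Hausdorff distance over boundary point sets, not the BraTS evaluation code (which scores tumor sub-regions separately and typically uses a percentile Hausdorff variant).

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks of the same shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def hausdorff_distance(a_pts, b_pts):
    """Symmetric Hausdorff distance between point sets of shape (N, d), (M, d)."""
    d = np.linalg.norm(a_pts[:, None, :] - b_pts[None, :, :], axis=-1)
    return max(d.min(axis=1).max(), d.min(axis=0).max())
```

For example, two 3x3 masks overlapping in 2 of their 3 foreground voxels give a Dice score of 2/3, and the point sets {(0, 0)} and {(3, 4)} are at Hausdorff distance 5.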
We developed an automated framework for brain tumor segmentation using a Swin transformer with enhanced local self-attention. Experimental results show that our method outperforms state-of-the-art 3D algorithms for brain tumor segmentation.