Pan Dichao, Shen Jianguo, Al-Huda Zaid, Al-Qaness Mohammed A A
College of Physics and Electronic Information Engineering, Zhejiang Normal University, Jinhua, 321004, China.
College of Physics and Electronic Information Engineering, Zhejiang Normal University, Jinhua, 321004, China; Zhejiang Institute of Optoelectronics, Jinhua, 321004, China.
Comput Biol Med. 2025 Mar;186:109662. doi: 10.1016/j.compbiomed.2025.109662. Epub 2025 Jan 14.
Accurate segmentation of brain tumors from MRI scans is a critical task in medical image analysis, yet it remains challenging because tumor shapes and sizes are complex and highly variable. Traditional convolutional neural networks (CNNs), while effective at local feature extraction, struggle to capture the long-range dependencies that are crucial for 3D medical image analysis. To address these limitations, this paper presents VcaNet, a novel architecture that integrates a Vision Transformer (ViT) with a module that fuses channel and spatial attention (CBAM) to enhance 3D brain tumor segmentation. The encoder of VcaNet employs a 3D enhanced convolution (ENCO) module to capture local volumetric features, while a Vision Transformer and a multi-scale feature fusion module are incorporated in the bottleneck to capture global dependencies. Additionally, a CBAM is introduced in the decoder to further improve the integration of local and global features, enhancing segmentation accuracy. Extensive experiments on two public BraTS datasets demonstrate that VcaNet outperforms existing models, particularly in handling the complex spatial structures of brain tumors. This approach provides valuable insights for improving brain tumor segmentation, and its performance in 3D tasks surpasses that of 2D models, laying a foundation for future advancements in medical imaging.
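To make the channel and spatial attention idea concrete, here is a minimal NumPy sketch of a CBAM-style gate applied to a 3D feature volume. It is an illustration under stated assumptions, not the VcaNet implementation: the function names, tensor shapes, reduction ratio `r`, and random weights are all hypothetical, and the spatial branch replaces the larger convolution kernel used in the original CBAM design with a per-voxel weighted sum (equivalent to a 1×1×1 convolution) to keep the example short.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Gate each channel of feat (C, D, H, W) with a weight in (0, 1).

    w1 has shape (C // r, C) and w2 has shape (C, C // r); together they
    form the shared two-layer MLP applied to both pooled descriptors.
    """
    avg = feat.mean(axis=(1, 2, 3))  # global average pooling -> (C,)
    mx = feat.max(axis=(1, 2, 3))    # global max pooling -> (C,)

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # linear, ReLU, linear

    gate = sigmoid(mlp(avg) + mlp(mx))       # (C,)
    return feat * gate[:, None, None, None]


def spatial_attention(feat, w_avg, w_max, bias=0.0):
    """Gate each voxel of feat (C, D, H, W) with a weight in (0, 1).

    Assumption: a weighted sum of the two pooled maps stands in for the
    spatial convolution, i.e. a 1x1x1 kernel instead of a larger one.
    """
    avg = feat.mean(axis=0)  # channel-wise average -> (D, H, W)
    mx = feat.max(axis=0)    # channel-wise max -> (D, H, W)
    gate = sigmoid(w_avg * avg + w_max * mx + bias)
    return feat * gate[None]


def cbam_like(feat, w1, w2, w_avg, w_max):
    """Apply channel attention first, then spatial attention."""
    return spatial_attention(channel_attention(feat, w1, w2), w_avg, w_max)


# Hypothetical sizes: 8 channels over a 4 x 6 x 6 volume, reduction ratio 4.
rng = np.random.default_rng(0)
C, D, H, W, r = 8, 4, 6, 6, 4
feat = rng.standard_normal((C, D, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
out = cbam_like(feat, w1, w2, w_avg=0.5, w_max=0.5)
print(out.shape)
```

Because both gates are sigmoid outputs, the module only rescales features and never changes the tensor shape, which is why a block of this kind can be dropped into a decoder stage without altering the surrounding layers.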