Niu Ke, Han Jiacheng, Cai Jiuyun
Beijing Information Science and Technology University, Computer School, Beijing, 100000, China.
Sci Rep. 2025 Jul 1;15(1):22236. doi: 10.1038/s41598-025-92010-y.
In medical image segmentation, traditional CNN-based models excel at extracting local features but are limited in capturing global features. Conversely, Mamba, a recent network architecture, captures long-range feature dependencies effectively and handles image inputs arranged as linear sequences well, but it does so at the cost of overlooking fine spatial relationships and local pixel interactions. These complementary weaknesses motivate hybrid approaches that combine the strengths of both architectures. We propose the CNN-Fusion-Mamba-based U-Net (CFM-UNet), which integrates CNN-based Bottle2neck blocks for local feature extraction with Mamba-based visual state space blocks for global feature extraction. The two parallel branches are fused through our SEF block, so that each compensates for the other's limitations. Experimental results show that CFM-UNet outperforms other advanced methods on medical image segmentation datasets covering the liver, liver tumors, the spine, and colon polyps, and that it generalizes notably well in liver segmentation. Our code is available at https://github.com/Jiacheng-Han/CFM-UNet.
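The parallel local/global design described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: `local_branch` is a 3x3 mean filter standing in for a CNN block, `global_branch` is a simple linear recurrence over row-major pixels standing in for a state space block, and `fuse` is a hypothetical fixed-weight combination that does not reproduce the SEF block, whose internals the abstract does not specify. The actual model is in the repository linked above.

```python
from typing import List

Grid = List[List[float]]


def local_branch(img: Grid) -> Grid:
    """Toy stand-in for a CNN branch: 3x3 mean filter with edge clamping."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)  # clamp row index
                    cc = min(max(c + dc, 0), w - 1)  # clamp column index
                    total += img[rr][cc]
            out[r][c] = total / 9.0
    return out


def global_branch(img: Grid, decay: float = 0.9) -> Grid:
    """Toy stand-in for a state space branch (assumption, not Mamba itself).

    Scans pixels in row-major order with the recurrence
    state_t = decay * state_{t-1} + (1 - decay) * x_t,
    so each output depends on every earlier pixel in the sequence.
    """
    h, w = len(img), len(img[0])
    state = 0.0
    flat_out: List[float] = []
    for row in img:
        for x in row:
            state = decay * state + (1.0 - decay) * x
            flat_out.append(state)
    # Reshape the 1D scan output back to the 2D grid.
    return [flat_out[r * w:(r + 1) * w] for r in range(h)]


def fuse(local: Grid, global_: Grid, gate: float = 0.5) -> Grid:
    """Hypothetical fusion: fixed convex combination of the two branches."""
    return [
        [gate * l + (1.0 - gate) * g for l, g in zip(lrow, grow)]
        for lrow, grow in zip(local, global_)
    ]


def dual_branch(img: Grid) -> Grid:
    """Run both branches on the same input and fuse their outputs."""
    return fuse(local_branch(img), global_branch(img))
```

The sketch also shows the trade-off the abstract mentions: the row-major scan carries information across the whole image, but vertically adjacent pixels end up a full row apart in the sequence, which is why a local branch is a useful complement.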