

SMF-net: semantic-guided multimodal fusion network for precise pancreatic tumor segmentation in medical CT image

Author Information

Zhou Wenyi, Shi Ziyang, Xie Bin, Li Fang, Yin Jiehao, Zhang Yongzhong, Hu Linan, Li Lin, Yan Yongming, Wei Xiajun, Hu Zhen, Luo Zhengmao, Peng Wanxiang, Xie Xiaochun, Long Xiaoli

Affiliations

School of Electronic Information and Physics, Central South University of Forestry and Technology, Changsha, China.

Department of Radiology, Zhuzhou Hospital Affiliated to Xiangya School of Medicine, Central South University, Zhuzhou, China.

Publication Information

Front Oncol. 2025 Jul 18;15:1622426. doi: 10.3389/fonc.2025.1622426. eCollection 2025.

DOI: 10.3389/fonc.2025.1622426
PMID: 40756121
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12313478/
Abstract

BACKGROUND

Accurate and automated segmentation of pancreatic tumors from CT images via deep learning is essential for the clinical diagnosis of pancreatic cancer. However, two key challenges persist: (a) complex phenotypic variations in pancreatic morphology cause segmentation models to focus predominantly on healthy tissue over tumors, compromising tumor feature extraction and segmentation accuracy; (b) existing methods often struggle to retain fine-grained local features, leading to performance degradation in pancreas-tumor segmentation.

METHODS

To overcome these limitations, we propose SMF-Net (Semantic-Guided Multimodal Fusion Network), a novel multimodal medical image segmentation framework integrating a CNN-Transformer hybrid encoder. The framework incorporates AMBERT, a progressive feature extraction module, and the Multimodal Token Transformer (MTT) to fuse visual and semantic features for enhanced tumor localization. Additionally, the Multimodal Enhanced Attention Module (MEAM) further improves the retention of local discriminative features. To address multimodal data scarcity, we adopt a semi-supervised learning paradigm based on a Dual-Adversarial-Student Network (DAS-Net). Furthermore, in collaboration with Zhuzhou Central Hospital, we constructed the Multimodal Pancreatic Tumor Dataset (MPTD).
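The abstract describes the MTT only at a high level: it fuses visual (image) tokens with semantic (text) tokens. A common way to realize such fusion is cross-attention, where each image token queries the text tokens. The following NumPy sketch illustrates that generic mechanism; all names, shapes, and random weights are illustrative assumptions, not SMF-Net's actual module.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(img_tokens, txt_tokens, d_k=32, seed=0):
    """Image tokens attend to text tokens: output row i is a mixture of
    text values, weighted by similarity of image query i to each text key."""
    rng = np.random.default_rng(seed)
    d_img, d_txt = img_tokens.shape[1], txt_tokens.shape[1]
    Wq = rng.standard_normal((d_img, d_k)) / np.sqrt(d_img)  # query projection
    Wk = rng.standard_normal((d_txt, d_k)) / np.sqrt(d_txt)  # key projection
    Wv = rng.standard_normal((d_txt, d_k)) / np.sqrt(d_txt)  # value projection
    Q, K, V = img_tokens @ Wq, txt_tokens @ Wk, txt_tokens @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d_k))   # (n_img, n_txt) attention weights
    return attn @ V                          # (n_img, d_k) fused features

img = np.random.default_rng(1).standard_normal((16, 64))  # 16 image patch tokens
txt = np.random.default_rng(2).standard_normal((8, 48))   # 8 report-text tokens
fused = cross_attention(img, txt)
print(fused.shape)  # (16, 32)
```

In a real network the projections would be learned and the fused tokens fed back into the segmentation decoder; this sketch only shows the token-level fusion step itself.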

RESULTS

The experimental results on the MPTD indicate that our model achieved Dice scores of 79.25% and 64.21% for pancreas and tumor segmentation, respectively, showing improvements of 2.24% and 4.18% over the original model. Furthermore, the model outperformed existing state-of-the-art methods on the QaTa-COVID-19 and MosMedData lung infection segmentation datasets in terms of average Dice scores, demonstrating its strong generalization ability.
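The Dice scores reported above measure volumetric overlap between predicted and ground-truth masks. A minimal sketch (not from the paper) of the standard computation, 2|A∩B| / (|A| + |B|):

```python
import numpy as np

def dice_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice coefficient between two binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return float(2.0 * intersection / (pred.sum() + target.sum() + eps))

# Toy 4x4 masks: predicted tumor region vs. ground truth.
pred = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
gt   = np.array([[0, 1, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
print(round(dice_score(pred, gt), 4))  # 2*3 / (4+3) -> 0.8571
```

A Dice score of 64.21% for the tumor class, as reported on the MPTD, means the predicted and true tumor masks overlap to roughly that fraction under this metric.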

CONCLUSION

The experimental results demonstrate that SMF-Net delivers accurate segmentation of the pancreas, tumors, and pulmonary regions, highlighting its strong potential for real-world clinical applications.


Figures (from the PMC full text):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f003/12313478/ea16bd27cca5/fonc-15-1622426-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f003/12313478/21c38f79b587/fonc-15-1622426-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f003/12313478/1766b7a522b5/fonc-15-1622426-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f003/12313478/a0cdefc79b5d/fonc-15-1622426-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f003/12313478/639732d93072/fonc-15-1622426-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f003/12313478/aa9cd2de90ff/fonc-15-1622426-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f003/12313478/e073db95869f/fonc-15-1622426-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f003/12313478/108380bd067c/fonc-15-1622426-g008.jpg

Similar Articles

1. SMF-net: semantic-guided multimodal fusion network for precise pancreatic tumor segmentation in medical CT image.
Front Oncol. 2025 Jul 18;15:1622426. doi: 10.3389/fonc.2025.1622426. eCollection 2025.
2. Multi-level channel-spatial attention and light-weight scale-fusion network (MCSLF-Net): multi-level channel-spatial attention and light-weight scale-fusion transformer for 3D brain tumor segmentation.
Quant Imaging Med Surg. 2025 Jul 1;15(7):6301-6325. doi: 10.21037/qims-2025-354. Epub 2025 Jun 30.
3. Influence of early through late fusion on pancreas segmentation from imperfectly registered multimodal magnetic resonance imaging.
J Med Imaging (Bellingham). 2025 Mar;12(2):024008. doi: 10.1117/1.JMI.12.2.024008. Epub 2025 Apr 26.
4. Structural semantic-guided MR synthesis from PET images via a dual cross-attention mechanism.
Med Phys. 2025 Jul;52(7):e17957. doi: 10.1002/mp.17957.
5. VMDU-net: a dual encoder multi-scale fusion network for polyp segmentation with Vision Mamba and Cross-Shape Transformer integration.
Front Artif Intell. 2025 Jun 18;8:1557508. doi: 10.3389/frai.2025.1557508. eCollection 2025.
6. LGF-Net: A multi-scale feature fusion network for thyroid nodule ultrasound image classification.
J Appl Clin Med Phys. 2025 Aug;26(8):e70149. doi: 10.1002/acm2.70149.
7. 3D-WDA-PMorph: Efficient 3D MRI/TRUS Prostate Registration using Transformer-CNN Network and Wavelet-3D-Depthwise-Attention.
J Imaging Inform Med. 2025 Jul 25. doi: 10.1007/s10278-025-01615-2.
8. Diffusion semantic segmentation model: A generative model for medical image segmentation based on joint distribution.
Med Phys. 2025 Jul;52(7):e17928. doi: 10.1002/mp.17928. Epub 2025 Jun 8.
9. A medical image classification method based on self-regularized adversarial learning.
Med Phys. 2024 Nov;51(11):8232-8246. doi: 10.1002/mp.17320. Epub 2024 Jul 30.
10. A CNN-transformer-based hybrid U-shape model with long-range relay for esophagus 3D CT image gross tumor volume segmentation.
Med Phys. 2025 Jul;52(7):e17818. doi: 10.1002/mp.17818. Epub 2025 Apr 14.
