AMVLM: Alignment-Multiplicity Aware Vision-Language Model for Semi-Supervised Medical Image Segmentation.

Authors

Pan Qingtao, Li Zhengrong, Qiao Wenhao, Lou Jingjiao, Yang Qing, Yang Guang, Ji Bing

Publication information

IEEE Trans Med Imaging. 2025 May 23;PP. doi: 10.1109/TMI.2025.3573018.

DOI:10.1109/TMI.2025.3573018
PMID:40408220
Abstract

Low-quality pseudo labels are a significant obstacle in semi-supervised medical image segmentation (SSMIS) because they impede consistency learning on unlabeled data. Leveraging a vision-language model (VLM) holds promise for improving pseudo-label quality by using textual prompts to delineate segmentation regions, but it faces cross-modal alignment uncertainty caused by multiple correspondences (multiple images tend to correspond to one text, and multiple texts to one image). Existing VLMs address this challenge by modeling semantics as distributions, but such distributions lead to semantic degradation. To address these problems, we propose the Alignment-Multiplicity Aware Vision-Language Model (AMVLM), a new VLM pre-training paradigm with two novel similarity metric strategies. (i) Cross-modal Similarity Supervision (CSS) uses a probability distribution transformer to supervise similarity scores across fine-granularity semantics by measuring cross-modal distribution disparities, thereby learning cross-modal multiple alignments. (ii) Intra-modal Contrastive Learning (ICL) takes into account the similarity metric of coarse- and fine-granularity information within each modality to encourage cross-modal semantic consistency. Furthermore, using the pretrained AMVLM, we propose a pioneering text-guided SSMIS network to compensate for the quality deficiencies of pseudo labels. This network incorporates a text mask generator that produces multimodal supervision information, enhancing pseudo-label quality and the model's consistency learning. Extensive experiments validate the efficacy of our AMVLM-driven SSMIS, which shows superior performance across four publicly available datasets. The code will be available at: https://github.com/QingtaoPan/AMVLM.
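To make the role of pseudo labels in consistency learning more concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how the text-derived mask is combined with the image-derived pseudo labels, so the agreement rule below is an assumption made purely for illustration, as are all function names, the 0.5 threshold, and the toy probability values.

```python
import math

def pseudo_labels(probs, threshold=0.5):
    """Binarize per-pixel foreground probabilities into hard pseudo labels."""
    return [1 if p >= threshold else 0 for p in probs]

def agreement_mask(image_labels, text_labels):
    """Hypothetical fusion rule: trust a pixel only if both label sources agree."""
    return [1 if a == b else 0 for a, b in zip(image_labels, text_labels)]

def masked_bce(student_probs, targets, mask, eps=1e-7):
    """Binary cross-entropy averaged over the pixels where mask == 1."""
    total, count = 0.0, 0
    for p, t, m in zip(student_probs, targets, mask):
        if m:
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count if count else 0.0

# Toy values for four pixels of one unlabeled image (all made up).
teacher_probs = [0.9, 0.8, 0.4, 0.1]    # image branch, used to form pseudo labels
text_mask_probs = [0.7, 0.3, 0.2, 0.2]  # stand-in for a text-derived mask
student_probs = [0.8, 0.6, 0.3, 0.2]    # prediction to be made consistent

image_labels = pseudo_labels(teacher_probs)
text_labels = pseudo_labels(text_mask_probs)
mask = agreement_mask(image_labels, text_labels)
loss = masked_bce(student_probs, image_labels, mask)
print(mask, round(loss, 4))
```

In this toy case the second pixel is excluded from the loss because the two label sources disagree on it, which illustrates how an extra supervision signal can filter out unreliable pseudo labels before the consistency term is computed.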
