Zhang Siqi, Zhang Qizhe, Zhang Shanghang, Liu Xiaohong, Yue Jingkun, Lu Ming, Xu Huihuan, Yao Jiaxin, Wei Xiaobao, Cao Jiajun, Zhang Xiang, Gao Ming, Shen Jun, Hao Yichang, Wang Yinkui, Zhang Xingcai, Wu Song, Zhang Ping, Cui Shuguang, Wang Guangyu
State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China.
State Key Laboratory of Multimedia Information Processing, School of Computer Science, Peking University, Beijing, China.
Nat Biomed Eng. 2025 Sep 5. doi: 10.1038/s41551-025-01497-3.
Vision foundation models have shown strong potential for generalist medical segmentation, offering a versatile, task-agnostic solution through a single model. However, current generalist models rely on simple pre-training over heterogeneous medical data that contain irrelevant information, which often leads to negative transfer and degraded performance. Furthermore, the practical applicability of foundation models across diverse open-world scenarios, especially in out-of-distribution (OOD) settings, has not been extensively evaluated. Here we construct a publicly accessible database, MedSegDB, organized as a tree-structured hierarchy and annotated from 129 public medical segmentation repositories and 5 in-house datasets. We further propose a Generalist Medical Segmentation model (MedSegX), a vision foundation model trained with a model-agnostic Contextual Mixture of Adapter Experts (ConMoAE) for open-world segmentation. We conduct a comprehensive evaluation of MedSegX across a range of medical segmentation tasks. Experimental results indicate that MedSegX achieves state-of-the-art performance across various modalities and organ systems in in-distribution (ID) settings. In OOD and real-world clinical settings, MedSegX maintains its performance in both zero-shot and data-efficient generalization, outperforming other foundation models.