You Chenyu, Dai Weicheng, Liu Fenglin, Min Yifei, Dvornek Nicha C, Li Xiaoxiao, Clifton David A, Staib Lawrence, Duncan James S
IEEE Trans Pattern Anal Mach Intell. 2024 Sep 13;PP. doi: 10.1109/TPAMI.2024.3461321.
Recent studies on contrastive learning have achieved remarkable performance solely by leveraging few labels in the context of medical image segmentation. Existing methods mainly focus on instance discrimination and invariant mapping (i.e., pulling positive samples closer and pushing negative samples apart in the feature space). However, they face three common pitfalls: (1) tailness: medical image data usually follows an implicit long-tail class distribution, so blindly leveraging all pixels in training can lead to data imbalance issues and deteriorated performance; (2) consistency: it remains unclear whether a segmentation model has learned meaningful yet consistent anatomical features, due to the intra-class variations between different anatomical features; and (3) diversity: the intra-slice correlations within the entire dataset have received significantly less attention. This motivates us to seek a principled approach for strategically making use of the dataset itself to discover similar yet distinct samples from different anatomical views. In this paper, we introduce a novel semi-supervised 2D medical image segmentation framework termed Mine yOur owN Anatomy (MONA), and make three contributions. First, prior work argues that every pixel matters equally to model training; we observe empirically that this alone is unlikely to define meaningful anatomical features, mainly due to the lack of a supervision signal. We show two simple solutions for learning invariances: stronger data augmentations and nearest neighbors. Second, we construct a set of objectives that encourage the model to decompose medical images into a collection of anatomical features in an unsupervised manner. Lastly, we demonstrate, both empirically and theoretically, the efficacy of MONA on three benchmark datasets, achieving a new state of the art across different semi-supervised labeled settings.
MONA makes minimal assumptions on domain expertise, and hence constitutes a practical and versatile solution in medical image analysis. We provide the PyTorch-like pseudo-code in supplementary.
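The abstract's notion of instance discrimination and invariant mapping (pulling each positive pair together while pushing negatives apart in feature space) is commonly realized with an InfoNCE-style contrastive loss. The sketch below is a minimal NumPy illustration of that general objective, not the authors' MONA implementation; the function name, temperature value, and batch layout are assumptions for illustration only (MONA's actual PyTorch-like pseudo-code is in the paper's supplementary material).

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.1):
    """InfoNCE contrastive loss over a batch of embedding pairs.

    Row i of z1 and row i of z2 form a positive pair (two views of the
    same sample); every other row of z2 serves as a negative. Minimizing
    the loss pulls positives together and pushes negatives apart.
    """
    # L2-normalize so the dot product is cosine similarity.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature               # pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal; average their negative log-prob.
    return -np.mean(np.diag(log_prob))
```

As a sanity check, the loss should be lower when the two views are correctly aligned than when the pairing is scrambled, since the scrambled diagonal no longer holds the true positives.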