IEEE Trans Med Imaging. 2022 Sep;41(9):2228-2237. doi: 10.1109/TMI.2022.3161829. Epub 2022 Aug 31.
Automated segmentation in medical image analysis is a challenging task that requires a large amount of manually labeled data. However, most existing learning-based approaches are limited by the scarcity of manually annotated medical data, which poses a major practical problem for accurate and robust medical image segmentation. In addition, most existing semi-supervised approaches are less robust than their supervised counterparts and lack explicit modeling of geometric structure and semantic information, both of which limit segmentation accuracy. In this work, we present SimCVD, a simple contrastive distillation framework that significantly advances the state of the art in voxel-wise representation learning. We first describe an unsupervised training strategy that takes two views of an input volume and predicts their signed distance maps of object boundaries under a contrastive objective, using only two independent dropout masks to generate the views. This simple approach works surprisingly well, performing on par with previous fully supervised methods while using much less labeled data. We hypothesize that dropout can be viewed as a minimal form of data augmentation and makes the network robust to representation collapse. We then propose to perform structural distillation by distilling pair-wise similarities. We evaluate SimCVD on two popular datasets: the Left Atrial Segmentation Challenge (LA) and the NIH pancreas CT dataset. The results on the LA dataset demonstrate that, at two labeled ratios (20% and 10%), SimCVD achieves average Dice scores of 90.85% and 89.03%, respectively, improvements of 0.91% and 2.22% over the previous best results. Our method can be trained in an end-to-end fashion, showing the promise of SimCVD as a general framework for downstream tasks such as medical image synthesis, enhancement, and registration.
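The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of the idea that two independent dropout masks can serve as the two "views". It assumes an InfoNCE-style loss with cosine similarity and a temperature, and it uses toy feature vectors in place of the network's predicted signed distance maps. The names `dropout`, `cosine`, `info_nce`, and `temperature` are hypothetical and are not taken from the SimCVD implementation.

```python
import math
import random


def dropout(vec, p, rng):
    """Inverted dropout (requires p < 1): zero each entry with probability p,
    rescale the kept entries by 1 / (1 - p)."""
    keep = 1.0 - p
    return [x / keep if rng.random() < keep else 0.0 for x in vec]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE-style loss (an assumed form, not confirmed by the abstract).

    view_a[i] and view_b[i] form the positive pair; every other view_b[j]
    acts as a negative for view_a[i].
    """
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy stand-ins for per-sample feature vectors (hypothetical data).
    features = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(4)]
    # Two views of the same inputs, differing only in their dropout masks.
    view_a = [dropout(f, 0.1, rng) for f in features]
    view_b = [dropout(f, 0.1, rng) for f in features]
    print("contrastive loss:", info_nce(view_a, view_b))
```

In this sketch the loss is small when each sample's two dropout views are more similar to each other than to the views of other samples, which is the behavior the abstract's hypothesis about dropout as a minimal augmentation relies on.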