IEEE Trans Med Imaging. 2021 Oct;40(10):2656-2671. doi: 10.1109/TMI.2020.3045775. Epub 2021 Sep 30.
Medical image segmentation has achieved remarkable advances using deep neural networks (DNNs). However, DNNs often require large amounts of data and annotations for training, both of which can be difficult and costly to obtain. In this work, we propose a unified framework for generalized low-shot (one- and few-shot) medical image segmentation based on distance metric learning (DML). Unlike most existing methods, which only address the lack of annotations while assuming abundant data, our framework works under extreme scarcity of both, which is ideal for rare diseases. Via DML, the framework learns a multimodal mixture representation for each category and performs dense predictions based on cosine distances between the pixels' deep embeddings and the category representations. The multimodal representations effectively exploit inter-subject similarities and intra-class variations to overcome overfitting due to extremely limited data. In addition, we propose adaptive mixing coefficients for the multimodal mixture distributions to adaptively emphasize the modes better suited to the current input. The representations are implicitly embedded as weights of the fully connected (fc) layer, such that the cosine distances can be computed efficiently via forward propagation. In our experiments on brain MRI and abdominal CT datasets, the proposed framework achieves superior performance for low-shot segmentation compared with standard DNN-based (3D U-Net) and classical registration-based (ANTs) methods, e.g., achieving mean Dice coefficients of 81%/69% for brain tissue/abdominal multi-organ segmentation using a single training sample, as compared to 52%/31% and 72%/35% by the U-Net and ANTs, respectively.
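To illustrate the core mechanism described above (cosine-distance dense prediction against per-class multimodal prototypes stored as fc-layer weights, with input-adaptive mixing coefficients), here is a minimal PyTorch sketch. It is an assumption-laden illustration, not the authors' implementation: the module name, the number of modes, the temperature, and the global-pooling scheme used to predict the mixing coefficients are all hypothetical choices made for this example.

```python
# Minimal sketch (assumed, not the paper's code): class prototypes are stored as the
# weights of a 1x1 convolution (the "fc layer"), so cosine similarities between pixel
# embeddings and prototypes are computed in one forward pass; adaptive mixing
# coefficients then combine the K modes of each class.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CosineMixtureHead(nn.Module):
    def __init__(self, embed_dim: int, num_classes: int, k_modes: int = 3, tau: float = 10.0):
        super().__init__()
        self.num_classes = num_classes
        self.k_modes = k_modes
        self.tau = tau  # temperature scaling of cosine logits (assumed value)
        # Prototypes implicitly embedded as conv weights: (num_classes * k_modes, embed_dim, 1, 1)
        self.prototypes = nn.Conv2d(embed_dim, num_classes * k_modes, kernel_size=1, bias=False)
        # Small predictor of adaptive mixing coefficients from a global image descriptor (assumed design).
        self.mix_net = nn.Linear(embed_dim, num_classes * k_modes)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W) pixel embeddings produced by a backbone network.
        B, C, H, W = feat.shape
        # Cosine similarity = dot product of L2-normalized embeddings and prototypes,
        # evaluated densely by a single 1x1 convolution.
        feat_n = F.normalize(feat, dim=1)
        w_n = F.normalize(self.prototypes.weight, dim=1)
        sims = F.conv2d(feat_n, w_n)                              # (B, num_classes*k_modes, H, W)
        sims = sims.view(B, self.num_classes, self.k_modes, H, W)
        # Adaptive mixing coefficients conditioned on the current input via global average pooling.
        g = feat.mean(dim=(2, 3))                                 # (B, C)
        mix = self.mix_net(g).view(B, self.num_classes, self.k_modes)
        mix = F.softmax(mix, dim=2).unsqueeze(-1).unsqueeze(-1)   # (B, num_classes, k_modes, 1, 1)
        # Per-class score: mixture-weighted cosine similarity, scaled for a softmax loss.
        logits = self.tau * (mix * sims).sum(dim=2)               # (B, num_classes, H, W)
        return logits


if __name__ == "__main__":
    head = CosineMixtureHead(embed_dim=64, num_classes=4)
    x = torch.randn(2, 64, 32, 32)       # dummy pixel embeddings
    print(head(x).shape)                 # torch.Size([2, 4, 32, 32])
```

A 2D head is shown for brevity; the same construction carries over to 3D volumes by replacing the 1x1 convolution with its 1x1x1 counterpart.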