IEEE Trans Med Imaging. 2023 Oct;42(10):2832-2841. doi: 10.1109/TMI.2023.3266137. Epub 2023 Oct 2.
A common problem in neural-network segmentation of medical images is the difficulty of obtaining enough pixel-level annotated data for training. To address this issue, we propose a semi-supervised segmentation network based on contrastive learning. In contrast to previous state-of-the-art methods, we introduce Min-Max Similarity (MMS), a contrastive-learning form of dual-view training that employs classifiers to build all-negative feature pairs and projectors to build positive and negative feature pairs, formulating the learning as an MMS problem. The all-negative pairs supervise the networks as they learn from different views and capture general features, while the consistency of predictions on unlabeled data is measured by a pixel-wise contrastive loss between positive and negative pairs. To evaluate the proposed method quantitatively and qualitatively, we test it on four public endoscopy surgical tool segmentation datasets and one cochlear implant surgery dataset that we manually annotated. Results indicate that the proposed method consistently outperforms state-of-the-art semi-supervised and fully supervised segmentation algorithms. Our semi-supervised segmentation algorithm can also successfully recognize unknown surgical tools and provide good predictions. In addition, the MMS approach achieves inference speeds of about 40 frames per second (fps), making it suitable for real-time video segmentation.
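The abstract states that consistency on unlabeled data is measured by a pixel-wise contrastive loss between positive and negative pairs, but it does not give the formula. As a rough illustration only, the sketch below uses a generic InfoNCE-style loss, which is an assumption and not necessarily the MMS formulation: features at the same pixel in two views form the positive pair, and features at different pixels form the negative pairs. The function names, the cosine similarity, and the `temperature` value are all hypothetical choices made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)  # epsilon guards against zero vectors


def pixel_contrastive_loss(view_a, view_b, temperature=0.1):
    """Generic InfoNCE-style pixel-wise loss (assumed form, not the paper's).

    view_a, view_b: equal-length lists of per-pixel feature vectors, one list
    per view. Pixel i in view_a is pulled toward pixel i in view_b (positive)
    and pushed away from every other pixel j != i in view_b (negatives).
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability of picking the positive pair.
        total += log_denominator - logits[i]
    return total / n


# Two "pixels" with orthogonal features.
features = [[1.0, 0.0], [0.0, 1.0]]
aligned = pixel_contrastive_loss(features, [[1.0, 0.0], [0.0, 1.0]])
swapped = pixel_contrastive_loss(features, [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

In this toy case the loss is close to zero when the two views agree pixel by pixel and large when the per-pixel features are swapped, which is the behavior a consistency term of this kind is meant to reward.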