Annu Int Conf IEEE Eng Med Biol Soc. 2022 Jul;2022:4740-4744. doi: 10.1109/EMBC48229.2022.9871359.
Advances in deep learning have proved useful for biomedical image segmentation. However, biomedical imagery, particularly in digital pathology, typically contains far more unlabeled than labeled data, which makes segmentation a natural semi-supervised learning problem. Specifically, because producing pixel-wise annotations is time-consuming and a pathologist's labeling time is costly, a large amount of data remains unlabeled that we wish to utilize in training segmentation algorithms. Pseudo-labeling is one method of leveraging unlabeled data to increase overall model performance. We adapt a pseudo-labeling method from image classification to select images for segmentation pseudo-labeling and apply it to three digital pathology datasets. To select images for pseudo-labeling, we define and explore different thresholds on confidence and uncertainty computed at the image level. Furthermore, we study how image-level uncertainty and confidence relate to model performance. We find that these certainty metrics do not consistently correlate with performance in the intuitive way, and that abnormal correlations serve as an indicator of a model's ability to produce pseudo-labels that are useful in training. Clinical relevance - The proposed approach adapts image-level confidence and uncertainty measures for segmentation pseudo-labeling on digital pathology datasets. Increased model performance enables better disease quantification for histopathology.
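The image-level selection step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how confidence or uncertainty is computed, so the sketch assumes binary segmentation, assumes that several stochastic forward passes (for example, with dropout active) are available per image, takes image-level confidence as the mean per-pixel maximum class probability, and takes image-level uncertainty as the mean per-pixel standard deviation across passes. The function names and the threshold values are hypothetical.

```python
import numpy as np


def image_level_scores(probs):
    """Return (confidence, uncertainty) for one image.

    probs: array of shape (T, H, W) holding foreground probabilities from
    T stochastic forward passes (an assumption of this sketch).
    """
    probs = np.asarray(probs, dtype=float)
    mean_prob = probs.mean(axis=0)                      # (H, W) averaged prediction
    # Per-pixel confidence: probability of the more likely of the two classes.
    pixel_conf = np.maximum(mean_prob, 1.0 - mean_prob)
    # Per-pixel uncertainty: spread of the prediction across the T passes.
    pixel_unc = probs.std(axis=0)
    # Aggregate both to a single number per image.
    return float(pixel_conf.mean()), float(pixel_unc.mean())


def select_for_pseudo_labeling(prob_stacks, conf_threshold=0.9, unc_threshold=0.05):
    """Keep images whose confidence is high AND whose uncertainty is low.

    Returns a list of (image_index, pseudo_label_mask) pairs.
    The default thresholds are illustrative, not values from the paper.
    """
    selected = []
    for idx, probs in enumerate(prob_stacks):
        conf, unc = image_level_scores(probs)
        if conf >= conf_threshold and unc <= unc_threshold:
            # Binarize the averaged prediction to form the pseudo-label.
            mask = (np.mean(probs, axis=0) >= 0.5).astype(np.uint8)
            selected.append((idx, mask))
    return selected


# Two toy 2x2 "images", each with T = 4 passes.
confident = np.full((4, 2, 2), 0.95)                    # stable, confident passes
noisy = np.stack([np.full((2, 2), 0.1),
                  np.full((2, 2), 0.9)] * 2)            # passes disagree strongly
selected = select_for_pseudo_labeling([confident, noisy])
```

In this toy run only the first image passes both thresholds. Sweeping `conf_threshold` and `unc_threshold` over a grid is one way to explore different thresholds in the manner the abstract describes.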