Zhou Shaofeng, Tian Shenwei, Yu Long, Wu Weidong, Zhang Dezhi, Peng Zhen, Zhou Zhicheng
College of Software, Xinjiang University, Urumqi, 830000, China.
Key Laboratory of Software Engineering Technology, Xinjiang University, Urumqi, 830000, China.
Med Biol Eng Comput. 2023 May;61(5):1033-1045. doi: 10.1007/s11517-022-02743-5. Epub 2023 Jan 17.
Recent research on semi-supervised learning (SSL) relies mainly on consistency regularization, which depends on domain-specific data augmentation. Pseudo-labeling is a more general method without this restriction, but its performance is limited by noisy training. We combine the two approaches and focus on generating pseudo-labels using domain-independent weak augmentation. In this article, we propose ReFixMatch-LS and apply it to medical image classification. First, we reduce the impact of noisy artificial labels through label smoothing and consistency regularization. Then, we record the high-confidence pseudo-labels generated in each epoch during training and reuse them to train the model in subsequent epochs. ReFixMatch-LS effectively increases the number of pseudo-labels and improves model performance. We validate ReFixMatch-LS on skin lesion diagnosis using the ISIC 2018 and ISIC 2019 challenge datasets, obtaining AUCs of 91.54%, 93.68%, 94.55%, and 95.47% on the four proportions of labeled data from ISIC 2018.
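The two steps described in the abstract (smoothing pseudo-labels, then recording high-confidence ones for reuse in later epochs) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract gives no hyperparameters or code, so the 0.95 confidence threshold, the 0.1 smoothing factor, the per-sample dictionary, and all names (`PseudoLabelBank`, `smooth_label`, `cross_entropy`) are assumptions made for illustration. It also assumes the consistency loss is computed against a second, more strongly augmented view of the same image.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def smooth_label(class_index, num_classes, epsilon=0.1):
    """Soft target: 1 - epsilon on the chosen class, epsilon shared by the rest.

    epsilon=0.1 is an assumed value, not one taken from the paper.
    """
    off_value = epsilon / (num_classes - 1)
    return [1.0 - epsilon if k == class_index else off_value
            for k in range(num_classes)]


class PseudoLabelBank:
    """Hypothetical store of high-confidence pseudo-labels, keyed by sample id."""

    def __init__(self, threshold=0.95):  # threshold is an assumption
        self.threshold = threshold
        self.labels = {}

    def update(self, sample_id, weak_logits):
        """Record the predicted class if the weak-view prediction is confident."""
        probs = softmax(weak_logits)
        confidence = max(probs)
        if confidence >= self.threshold:
            self.labels[sample_id] = probs.index(confidence)

    def target(self, sample_id, num_classes, epsilon=0.1):
        """Return a smoothed target for a recorded sample, else None."""
        if sample_id not in self.labels:
            return None
        return smooth_label(self.labels[sample_id], num_classes, epsilon)


def cross_entropy(target, logits):
    """Cross-entropy between a soft target and the softmax of the logits."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs))


# Epoch 1: a confident prediction on the weakly augmented view is recorded.
bank = PseudoLabelBank()
bank.update("img_0", [8.0, 0.0, 0.0])

# Epoch 2: the prediction is no longer confident, but the recorded label
# is still available, so the sample keeps contributing to the loss.
bank.update("img_0", [0.1, 0.0, 0.0])
reused_target = bank.target("img_0", num_classes=3)
loss = cross_entropy(reused_target, [5.0, 0.0, 0.0])
```

In this sketch a recorded label is only overwritten by a later confident prediction, which is one way the number of usable pseudo-labels can grow across epochs; how the actual method resolves conflicting labels between epochs is not stated in the abstract.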