Department of Regenerative and Infectious Pathology, Hamamatsu University School of Medicine, Hamamatsu, Shizuoka, Japan.
Division of Pathology, Cancer Institute, Japanese Foundation for Cancer Research, Tokyo, Japan.
PLoS One. 2023 May 18;18(5):e0285996. doi: 10.1371/journal.pone.0285996. eCollection 2023.
Deep learning technology has been used in the medical field to produce devices for clinical practice. In cytology, deep learning methods could enhance cancer screening while providing quantitative, objective, and highly reproducible testing. However, constructing high-accuracy deep learning models requires a large amount of manually labeled data, which is time-consuming to produce. To address this issue, we used the Noisy Student Training technique, which reduces the quantity of labeled data required, to create a binary classification deep learning model for cervical cytology screening. We used 140 whole-slide images from liquid-based cytology specimens: 50 low-grade squamous intraepithelial lesions, 50 high-grade squamous intraepithelial lesions, and 40 negative samples. We extracted 56,996 images from the slides and used them to train and test the model. We trained EfficientNet on 2,600 manually labeled images, used it to generate pseudo labels for the unlabeled data, and then self-trained it within a student-teacher framework. The resulting model classified images as normal or abnormal based on the presence or absence of abnormal cells. The Grad-CAM approach was used to visualize the image regions that contributed to the classification. On our test data, the model achieved an area under the curve of 0.908, an accuracy of 0.873, and an F1-score of 0.833. We also explored the optimal confidence threshold score and the optimal augmentation approaches for low-magnification images. Our model efficiently classified normal and abnormal images at low magnification with high reliability, making it a promising screening tool for cervical cytology.
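The pseudo-labeling step with a confidence threshold can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_pseudo_labels`, the example probabilities, and the threshold value of 0.9 are all hypothetical, chosen only to show how a teacher model's predictions on unlabeled images might be filtered before student training.

```python
from typing import List, Tuple


def select_pseudo_labels(
    probs_abnormal: List[float], threshold: float = 0.9
) -> List[Tuple[int, int]]:
    """Keep only unlabeled images the teacher is confident about.

    probs_abnormal: teacher-predicted probability of "abnormal" per image.
    threshold: hypothetical confidence cutoff (not taken from the paper).
    Returns (image_index, pseudo_label) pairs, where 1 = abnormal, 0 = normal.
    """
    selected = []
    for idx, p in enumerate(probs_abnormal):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range at index {idx}: {p}")
        # For a binary classifier, confidence is the probability of the
        # predicted class, i.e. the larger of p and 1 - p.
        confidence = max(p, 1.0 - p)
        if confidence >= threshold:
            pseudo_label = 1 if p >= 0.5 else 0
            selected.append((idx, pseudo_label))
    return selected


if __name__ == "__main__":
    # Made-up teacher outputs for four unlabeled images.
    teacher_probs = [0.97, 0.55, 0.02, 0.80]
    print(select_pseudo_labels(teacher_probs, threshold=0.9))
```

In a student-teacher loop, the selected pairs would be merged with the manually labeled images to train the student, which then serves as the teacher in the next round. Raising the threshold yields fewer but cleaner pseudo labels, which is the trade-off a threshold search is meant to balance.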