IEEE J Biomed Health Inform. 2023 Dec;27(12):5970-5981. doi: 10.1109/JBHI.2023.3314663. Epub 2023 Dec 5.
Early identification of endometrial cancer or precancerous lesions from histopathological images is crucial for precise endometrial medical care, but it is increasingly hampered by the relative scarcity of pathologists. Computer-aided diagnosis (CAD) provides an automated alternative for confirming endometrial diseases, using either feature-engineered machine learning or end-to-end deep learning (DL). In particular, advanced self-supervised learning reduces the dependence of supervised learning on large-scale human-annotated data and can be used to pre-train DL models for specific classification tasks. To this end, we develop a novel self-supervised triplet contrastive learning (SSTCL) model for classifying endometrial histopathological images. Specifically, the model consists of one online branch and two target branches. The second target branch includes a simple yet powerful augmentation module named random mosaic masking (RMM), which acts as an effective regularizer by mapping the features of masked images close to those of intact ones. Moreover, we add a bottleneck Transformer (BoT) model to each branch as a self-attention module that learns global information by considering both content information and relative distances between features at different locations. On a public endometrial dataset, our model achieved four-class classification accuracies of 77.31 ± 0.84%, 80.87 ± 0.48%, and 83.22 ± 0.87% using 20%, 50%, and 100% of the labeled images, respectively. When transferred to an in-house dataset, our model obtained a three-class diagnostic accuracy of 96.81%, with a 95% confidence interval of 95.61-98.02%. On both datasets, our model outperformed state-of-the-art supervised and self-supervised methods. Our model may help pathologists diagnose endometrial diseases automatically, with high accuracy and efficiency, using limited human-annotated histopathological images.
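To make the three-branch idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not specify how RMM chooses or fills masked regions, nor the exact loss, so everything here is an assumption: the hypothetical `random_mosaic_mask` zeroes randomly chosen square cells of a grid, and the hypothetical `three_branch_loss` uses a BYOL-style cosine alignment term as a stand-in to pull the online embedding toward both the augmented-view and the masked-view target embeddings. The names `cell`, `mask_ratio`, and all function names are illustrative only.

```python
import numpy as np


def random_mosaic_mask(image, cell=16, mask_ratio=0.3, rng=None):
    """Zero out a random subset of square grid cells (assumed form of RMM).

    image: array of shape (H, W) or (H, W, C).
    cell: side length of one mosaic cell in pixels (hypothetical parameter).
    mask_ratio: probability that any given cell is masked (hypothetical).
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    grid_h = -(-h // cell)  # ceiling division so the grid covers the image
    grid_w = -(-w // cell)
    keep_cells = rng.random((grid_h, grid_w)) >= mask_ratio  # True = keep
    # Expand the per-cell decision to pixel resolution, then crop to H x W.
    keep = np.repeat(np.repeat(keep_cells, cell, axis=0), cell, axis=1)[:h, :w]
    if image.ndim == 3:
        keep = keep[..., None]  # broadcast over channels
    return image * keep


def cosine_alignment_loss(p, z):
    """Mean of 2 - 2*cos(p, z) over a batch of embeddings (BYOL-style)."""
    p = p / np.linalg.norm(p, axis=1, keepdims=True)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    return float(np.mean(2.0 - 2.0 * np.sum(p * z, axis=1)))


def three_branch_loss(online, target_aug, target_masked):
    """Align the online embedding with both target-branch embeddings.

    target_masked stands for the embedding of the RMM-masked view, so
    minimizing this term maps masked-image features close to intact ones.
    """
    return (cosine_alignment_loss(online, target_aug)
            + cosine_alignment_loss(online, target_masked))
```

In a real pipeline the three embeddings would come from encoder networks (here, with BoT blocks) applied to differently augmented views of the same image; the sketch covers only the masking step and the alignment objective.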