Netherlands Cancer Institute, Plesmanlaan 121, Amsterdam, CX 1066, the Netherlands; University of Amsterdam, Science Park 402, Amsterdam, XH 1098, the Netherlands.
University of Amsterdam, Science Park 402, Amsterdam, XH 1098, the Netherlands; Ellogon AI B.V., the Netherlands.
Med Image Anal. 2022 Jul;79:102464. doi: 10.1016/j.media.2022.102464. Epub 2022 Apr 29.
We propose a Deep learning-based weak label learning method for analyzing whole slide images (WSIs) of Hematoxylin and Eosin (H&E) stained tumor tissue that requires no pixel-level or tile-level annotations: Self-supervised pre-training and heterogeneity-aware deep Multiple Instance LEarning (DeepSMILE). We apply DeepSMILE to the prediction of homologous recombination deficiency (HRD) and microsatellite instability (MSI). We use contrastive self-supervised learning to pre-train a feature extractor on histopathology tiles of cancer tissue. We then use variability-aware deep multiple instance learning to learn the tile feature aggregation function while modeling tumor heterogeneity. For MSI prediction in a tumor-annotated and color-normalized subset of TCGA-CRC (n=360 patients), contrastive self-supervised learning improves the tile supervision baseline from 0.77 to 0.87 AUROC, on par with our proposed DeepSMILE method. On TCGA-BC (n=1041 patients), without any manual annotations, DeepSMILE improves HRD classification performance from 0.77 to 0.81 AUROC compared to tile supervision with either a self-supervised or an ImageNet pre-trained feature extractor. On both datasets, our proposed methods reach the baseline performance using only 40% of the labeled data. These improvements suggest that standard self-supervised learning techniques, combined with multiple instance learning, can improve genomic label classification performance in the histopathology domain while using less labeled data.
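As a rough illustration of what a variability-aware tile feature aggregation function could look like, the sketch below pools per-tile feature vectors into a single slide-level vector made of an attention-weighted mean and an attention-weighted variance. This is only one plausible reading of the abstract, not the authors' implementation: the function names, the toy data, and the choice to pass attention scores in directly (in practice they would presumably come from a learned network) are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def variability_aware_pool(tile_features, attention_scores):
    """Pool per-tile feature vectors into one slide-level vector.

    Hypothetical helper: returns the attention-weighted mean of each
    feature dimension, followed by the attention-weighted variance of
    each feature dimension (a simple proxy for heterogeneity).
    """
    if not tile_features or len(tile_features) != len(attention_scores):
        raise ValueError("need one attention score per tile")
    weights = softmax(attention_scores)
    dim = len(tile_features[0])
    mean = [
        sum(w * f[d] for w, f in zip(weights, tile_features))
        for d in range(dim)
    ]
    var = [
        sum(w * (f[d] - mean[d]) ** 2 for w, f in zip(weights, tile_features))
        for d in range(dim)
    ]
    return mean + var  # list concatenation: length is 2 * dim


# Hypothetical toy slide: three tiles, two features each, equal attention.
slide = [[1.0, 0.0], [3.0, 0.0], [5.0, 0.0]]
pooled = variability_aware_pool(slide, [0.0, 0.0, 0.0])
```

With equal attention scores the weights are uniform, so the first half of `pooled` is the plain per-feature mean and the second half is the population variance across tiles. A slide-level classifier would then take this pooled vector as input, which is what lets it be trained from a single label per slide.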