Wulczyn Ellery, Steiner David F, Moran Melissa, Plass Markus, Reihs Robert, Tan Fraser, Flament-Auvigne Isabelle, Brown Trissia, Regitnig Peter, Chen Po-Hsuan Cameron, Hegde Narayan, Sadhwani Apaar, MacDonald Robert, Ayalew Benny, Corrado Greg S, Peng Lily H, Tse Daniel, Müller Heimo, Xu Zhaoyang, Liu Yun, Stumpe Martin C, Zatloukal Kurt, Mermel Craig H
Google Health, Palo Alto, CA, USA.
Medical University of Graz, Graz, Austria.
NPJ Digit Med. 2021 Apr 19;4(1):71. doi: 10.1038/s41746-021-00427-2.
Deriving interpretable prognostic features from deep-learning-based prognostic histopathology models remains a challenge. In this study, we developed a deep learning system (DLS) for predicting disease-specific survival for stage II and III colorectal cancer using 3652 cases (27,300 slides). When evaluated on two validation datasets containing 1239 cases (9340 slides) and 738 cases (7140 slides), respectively, the DLS achieved a 5-year disease-specific survival AUC of 0.70 (95% CI: 0.66-0.73) and 0.69 (95% CI: 0.64-0.72), and added significant predictive value to a set of nine clinicopathologic features. To interpret the DLS, we explored the ability of different human-interpretable features to explain the variance in DLS scores. We observed that clinicopathologic features such as T-category, N-category, and grade explained a small fraction of the variance in DLS scores (R² = 18% in both validation sets). Next, we generated human-interpretable histologic features by clustering embeddings from a deep-learning-based image-similarity model and showed that they explained the majority of the variance (R² of 73-80%). Furthermore, the clustering-derived feature most strongly associated with high DLS scores was also highly prognostic in isolation. This feature has a distinct visual appearance (poorly differentiated tumor cell clusters adjacent to adipose tissue) and was identified by annotators with 87.0-95.5% accuracy. Our approach can be used to explain predictions from a prognostic deep learning model and to uncover potentially novel prognostic features that can be reliably identified by people for future validation studies.
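The variance-explained analysis described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the embeddings, case-level scores, number of clusters, the plain k-means routine, and every function name (`kmeans`, `cluster_fractions`, `r_squared`, `demo`) are assumptions made for illustration, and all data are synthetic. The sketch clusters patch embeddings, summarizes each case by the fraction of its patches in each cluster, and reports the R² of a linear fit of those fractions to a model score.

```python
import numpy as np


def kmeans(x, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of x."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels


def cluster_fractions(labels, case_ids, n_cases, k):
    """Fraction of each case's patches that fall in each cluster."""
    counts = np.zeros((n_cases, k))
    np.add.at(counts, (case_ids, labels), 1.0)
    return counts / counts.sum(axis=1, keepdims=True)


def r_squared(features, target):
    """In-sample R^2 of an ordinary least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return 1.0 - resid.var() / target.var()


def demo(seed=0):
    """Synthetic stand-in for patch embeddings and case-level model scores."""
    rng = np.random.default_rng(seed)
    n_cases, patches_per_case, dim, k = 200, 30, 8, 5  # hypothetical sizes
    blob_centers = rng.normal(scale=5.0, size=(k, dim))
    mix = rng.dirichlet(np.ones(k), size=n_cases)  # per-case tissue mix
    case_ids = np.repeat(np.arange(n_cases), patches_per_case)
    true_type = np.array([rng.choice(k, p=mix[c]) for c in case_ids])
    embeddings = blob_centers[true_type] + rng.normal(size=(len(case_ids), dim))
    # Hypothetical score: linear in the tissue mix plus a little noise.
    scores = mix @ rng.normal(size=k) + rng.normal(scale=0.1, size=n_cases)
    labels = kmeans(embeddings, k, seed=seed)
    feats = cluster_fractions(labels, case_ids, n_cases, k)
    return feats, r_squared(feats, scores)


if __name__ == "__main__":
    feats, r2 = demo()
    print(f"R^2 of cluster-fraction features vs. score: {r2:.2f}")
```

Because the R² here is computed in-sample on synthetic data, its value says nothing about the figures reported in the abstract; the sketch only shows the shape of the computation.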