Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts.
Department of Computer Science, Princeton University, Princeton, New Jersey.
Clin Cancer Res. 2021 May 15;27(10):2868-2878. doi: 10.1158/1078-0432.CCR-20-4119. Epub 2021 Mar 15.
Histopathology evaluation is the gold standard for diagnosing clear cell (ccRCC), papillary, and chromophobe renal cell carcinoma (RCC). However, interrater variability has been reported, and the whole-slide histopathology images likely contain underutilized biological signals predictive of genomic profiles.
To address this knowledge gap, we obtained whole-slide histopathology images and demographic, genomic, and clinical data from The Cancer Genome Atlas, the Clinical Proteomic Tumor Analysis Consortium, and Brigham and Women's Hospital (Boston, MA) to develop computational methods for integrating data analyses. Leveraging these large and diverse datasets, we developed fully automated convolutional neural networks to diagnose renal cancers and connect quantitative pathology patterns with patients' genomic profiles and prognoses.
Our deep convolutional neural networks successfully detected malignancy (AUC in the independent validation cohort: 0.964-0.985), diagnosed RCC histologic subtypes (independent validation AUCs of the best models: 0.953-0.993), and predicted stage I ccRCC patients' survival outcomes (log-rank test P = 0.02). Our machine learning approaches further identified histopathology image features indicative of copy-number alterations (AUC > 0.7 in multiple genes in patients with ccRCC) and tumor mutation burden.
Our results suggest that convolutional neural networks can extract histologic signals predictive of patients' diagnoses, prognoses, and genomic variations of clinical importance. Our approaches can systematically identify previously unknown relations among diverse data modalities.
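For readers less familiar with the AUC values reported above, the following is a minimal sketch of how an area under the ROC curve can be computed from binary labels and classifier scores. It is not the authors' evaluation code: the function name `roc_auc` and the toy labels and scores are hypothetical and invented purely for illustration. The sketch uses the rank-based interpretation of AUC, namely the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs
    in which the positive case has the higher score (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # tie: half credit
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not taken from the study).
toy_labels = [1, 1, 1, 0, 0, 0]              # 1 = malignant, 0 = benign
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]  # hypothetical model outputs
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a sense of scale for the values of 0.953-0.993 reported for subtype diagnosis. The pairwise loop is quadratic in the number of cases; for large cohorts a sort-based implementation or an established library routine would be preferable.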