The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA.
Computational Sciences PhD Program, University of Massachusetts-Boston, Boston, MA, USA.
Nat Commun. 2020 Dec 11;11(1):6367. doi: 10.1038/s41467-020-20030-5.
Histopathological images are a rich but incompletely explored data type for studying cancer. Manual inspection is time consuming, making it challenging to use for image data mining. Here we show that convolutional neural networks (CNNs) can be systematically applied across cancer types, enabling comparisons to reveal shared spatial behaviors. We develop CNN architectures to analyze 27,815 hematoxylin and eosin scanned images from The Cancer Genome Atlas for tumor/normal, cancer subtype, and mutation classification. Our CNNs are able to classify TCGA pathologist-annotated tumor/normal status of whole slide images (WSIs) in 19 cancer types with consistently high AUCs (0.995 ± 0.008), as well as subtypes with lower but significant accuracy (AUC 0.87 ± 0.1). Remarkably, tumor/normal CNNs trained on one tissue are effective in others (AUC 0.88 ± 0.11), with classifier relationships also recapitulating known adenocarcinoma, carcinoma, and developmental biology. Moreover, classifier comparisons reveal intra-slide spatial similarities, with an average tile-level correlation of 0.45 ± 0.16 between classifier pairs. Breast cancers, bladder cancers, and uterine cancers have spatial patterns that are particularly easy to detect, suggesting these cancers can be canonical types for image analysis. Patterns for TP53 mutations can also be detected, with WSI self- and cross-tissue AUCs ranging from 0.65-0.80. Finally, we comparatively evaluate CNNs on 170 breast and colon cancer images with pathologist-annotated nuclei, finding that both cellular and intercellular regions contribute to CNN accuracy. These results demonstrate the power of CNNs not only for histopathological classification, but also for cross-comparisons to reveal conserved spatial behaviors across tumors.
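As a rough illustration of the tile-based analysis described above (not the authors' actual pipeline), the following Python sketch shows how per-tile tumor probabilities might be averaged into a slide-level score for computing an AUC, and how tile-level agreement between two classifiers could be quantified with a Pearson correlation. The synthetic data, the variable names (tile_probs, probs_a, probs_b), and the mean-aggregation rule are illustrative assumptions only.

import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import roc_auc_score

# Synthetic stand-in: 40 slides, 200 tiles each, alternating tumor/normal labels.
rng = np.random.default_rng(0)
n_slides, tiles_per_slide = 40, 200
slide_labels = np.array([0, 1] * (n_slides // 2))  # 1 = tumor, 0 = normal

def tile_probs(labels, noise=0.25):
    """Fake per-tile tumor probabilities, biased toward the slide label."""
    base = np.repeat(labels, tiles_per_slide).astype(float)
    return np.clip(base + rng.normal(0.0, noise, base.size), 0.0, 1.0)

probs_a = tile_probs(slide_labels)  # e.g., a classifier trained on tissue A
probs_b = tile_probs(slide_labels)  # e.g., a classifier trained on tissue B

# Slide-level score: mean of the tile probabilities (one simple aggregation choice).
slide_scores = probs_a.reshape(n_slides, tiles_per_slide).mean(axis=1)
print("slide-level AUC:", round(roc_auc_score(slide_labels, slide_scores), 3))

# Tile-level agreement between the two classifiers' probability maps,
# in the spirit of the reported mean pairwise tile-level correlation (0.45 ± 0.16).
r, _ = pearsonr(probs_a, probs_b)
print("tile-level Pearson r:", round(r, 3))

In practice the aggregation rule and tile size would follow the study's own protocol; this sketch only makes the two reported quantities (slide-level AUC and tile-level correlation) concrete.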