Su Andrew, Lee HoJoon, Tan Xiao, Suarez Carlos J, Andor Noemi, Nguyen Quan, Ji Hanlee P
Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, 4072, Australia.
Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, USA.
NPJ Precis Oncol. 2022 Mar 2;6(1):14. doi: 10.1038/s41698-022-00252-0.
Deep-learning classification systems have the potential to improve cancer diagnosis. However, their development has so far depended on prior pathological annotations and large training datasets. Manual annotation is low-resolution, time-consuming and subject to inter-observer variability. To address this issue, we developed the H&E Molecular neural network (HEMnet). HEMnet uses immunohistochemistry as an initial molecular label for cancer cells on an H&E image and trains a cancer classifier on the overlapping clinical histopathological images. Using this molecular transfer method, HEMnet generated and labeled 21,939 tumor and 8,782 normal tiles from ten whole-slide images for model training. The trained model identified colorectal cancer regions with ROC AUC values of 0.84 and 0.73 when compared with p53 staining and pathological annotations, respectively. In a validation study using histopathology images from TCGA samples, HEMnet's tumor purity estimates correlated significantly (regression coefficient of 0.8) with estimates based on genomic sequencing data. Thus, HEMnet helps address two main challenges in deep-learning analysis of cancer: the need for a large number of training images and the dependence on manual labeling by a pathologist. HEMnet also predicts cancer cells at a much higher resolution than manual histopathologic evaluation. Overall, our method provides a path towards fully automated delineation of any type of tumor, so long as a cancer-oriented molecular stain is available for subsequent learning. Software, tutorials and interactive tools are available at: https://github.com/BiomedicalMachineLearning/HEMnet.
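The two tile-level quantities reported above, ROC AUC against a reference label and tumor purity, can be illustrated with a minimal sketch. This is not code from the HEMnet repository: the function names `roc_auc` and `tumor_purity`, the 0.5 threshold, and the definition of purity as the fraction of tiles predicted as tumor are assumptions made for illustration, and the scores are made-up toy values.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based ROC AUC: the probability that a tumor tile outscores a
    normal tile, with ties counted as half (Mann-Whitney U formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one tumor and one normal tile")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def tumor_purity(scores: Sequence[float], threshold: float = 0.5) -> float:
    """Assumed definition: fraction of tissue tiles whose predicted tumor
    probability is at or above the threshold."""
    if not scores:
        raise ValueError("no tiles to score")
    return sum(s >= threshold for s in scores) / len(scores)


# Toy example: reference labels (1 = tumor, 0 = normal), e.g. derived from a
# molecular stain, and hypothetical classifier scores for the same tiles.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.1]

auc = roc_auc(labels, scores)      # 5 of 6 tumor/normal pairs ranked correctly
purity = tumor_purity(scores)      # 3 of 5 tiles at or above 0.5
```

In this sketch the same per-tile scores feed both measures: the AUC compares them with a reference labeling, while purity summarizes them for a whole slide without needing any reference.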