Powell Reid Trenton, Olar Adriana, Narang Shivali, Rao Ganesh, Sulman Erik, Fuller Gregory N, Rao Arvind
Center for Translational Cancer Research, Texas A&M Health Science Center, Institute of Biosciences and Technology, Houston, TX 77030, USA.
Department of Hematopathology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
J Pathol Inform. 2017 Mar 10;8:9. doi: 10.4103/jpi.jpi_43_16. eCollection 2017.
BACKGROUND: Glioma, the most common primary brain neoplasm, is a heterogeneous tumor comprising multiple histologic subtypes and cellular origins. At clinical presentation, gliomas are graded according to World Health Organization (WHO) guidelines, which reflect the malignant characteristics of the tumor based on histopathological and molecular features. Lower grade diffuse gliomas (LGGs) (WHO Grade II-III) have fewer malignant characteristics than high-grade gliomas (WHO Grade IV) and a better clinical prognosis; however, accurate discrimination of overall survival (OS) remains a challenge. In this study, we aimed to identify tissue-derived image features using a machine learning approach to predict OS in a cohort of lower grade glioma patients of mixed histology and grade. To achieve this aim, we used hematoxylin and eosin (H&E) stained slides from the public LGG cohort of The Cancer Genome Atlas (TCGA) to create a machine-learned dictionary of "image-derived visual words" associated with OS. We then evaluated the combined efficacy of these visual words in predicting short versus long OS by training a generalized machine learning model. Finally, we mapped these predictive visual words back to molecular signaling cascades to infer potential drivers of the machine-learned survival-associated phenotypes.

METHODS: We analyzed digitized histological sections downloaded from the LGG cohort of TCGA using a bag-of-words approach. This method identified a diverse set of histological patterns that were further correlated with OS, histology, and molecular signaling activity using Cox regression, analysis of variance, and Spearman correlation, respectively. A support vector machine (SVM) model was constructed to discriminate patients into short and long OS groups, dichotomized at 24 months.

RESULTS: This method identified disease-relevant phenotypes associated with OS, some of which are correlated with disease-associated molecular pathways. From these image-derived phenotypes, we obtained a generalized SVM model that could discriminate 24-month OS (area under the curve, 0.76).

CONCLUSION: Here, we demonstrated one potential strategy to incorporate image features derived from H&E stained slides into predictive models of OS. In addition, we showed how these image-derived phenotypic characteristics correlate with molecular signaling activity underlying the etiology or behavior of LGG.
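The bag-of-words step named in METHODS can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state how patch features were extracted or which clustering method built the dictionary, so the use of k-means, the two-dimensional synthetic patch features, and every function name below are assumptions made for illustration. The sketch learns a small dictionary of "visual words" from pooled patch features and encodes each slide as a normalized histogram of word counts; such histograms are the kind of per-patient feature vector that could then be passed to a Cox model or an SVM.

```python
import random


def nearest(point, centers):
    """Index of the center closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centers[k])),
    )


def learn_dictionary(patches, n_words, n_iter=20, seed=0):
    """Plain k-means (an assumed choice); cluster centers act as visual words."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(patches, n_words)]
    for _ in range(n_iter):
        groups = [[] for _ in range(n_words)]
        for p in patches:
            groups[nearest(p, centers)].append(p)
        for k, group in enumerate(groups):
            if group:  # keep the previous center if a cluster is empty
                centers[k] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def encode_slide(patches, centers):
    """Normalized histogram of visual-word counts for one slide."""
    counts = [0] * len(centers)
    for p in patches:
        counts[nearest(p, centers)] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Synthetic stand-in for patch descriptors (hypothetical data, not TCGA).
data_rng = random.Random(1)


def fake_slide(n_patches, shift):
    return [
        [data_rng.gauss(shift, 1.0), data_rng.gauss(-shift, 1.0)]
        for _ in range(n_patches)
    ]


slides = [fake_slide(50, s) for s in (0.0, 0.0, 3.0, 3.0)]
all_patches = [p for slide in slides for p in slide]

dictionary = learn_dictionary(all_patches, n_words=4)
histograms = [encode_slide(slide, dictionary) for slide in slides]
```

Each entry of `histograms` has one value per visual word and sums to 1, so slides with different numbers of patches remain comparable. In a real analysis the dictionary would be learned on training patients only, to avoid leaking information into the evaluation of the survival classifier.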