Wu Julie, Bryan Jordan, Rubinstein Samuel M, Wang Lucy, Lenoue-Newton Michele, Zuhour Raed, Levy Mia, Micheel Christine, Xu Yaomin, Bhavnani Suresh K, Mackey Lester, Warner Jeremy L
Department of Internal Medicine, Vanderbilt University, Nashville, TN.
Duke University, Durham, NC.
JCO Precis Oncol. 2020 Jun 25;4. doi: 10.1200/PO.19.00394. eCollection 2020.
Our goal was to identify the opportunities and challenges in analyzing data from the American Association for Cancer Research Project Genomics Evidence Neoplasia Information Exchange (GENIE), a multi-institutional database derived from clinically driven genomic testing, at both the inter- and intra-institutional levels. Inter-institutionally, we identified genotypic differences between primary and metastatic tumors across the 3 most represented cancers in GENIE. Intra-institutionally, we analyzed the clinical characteristics of the Vanderbilt-Ingram Cancer Center (VICC) subset of GENIE to inform the interpretation of GENIE as a whole.
We performed overall cohort matching on the basis of age, ethnicity, and sex of 13,208 patients stratified by cancer type (breast, colon, or lung) and sample site (primary or metastatic). We then determined whether detected variants, at the gene level, were associated with primary or metastatic tumors. We extracted clinical data for the VICC subset from VICC's clinical data warehouse. Treatment exposures were mapped to a 13-class schema derived from the HemOnc ontology.
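The gene-level comparison described above can be illustrated with a 2x2 contingency table per gene (variant present or absent, by metastatic or primary sample). The following is a minimal sketch, not the authors' pipeline: the abstract does not name the statistical test, so the use of Fisher's exact test is an assumption, and the function names and counts are hypothetical.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]].

    a: metastatic samples with a variant in the gene
    b: metastatic samples without one
    c: primary samples with a variant in the gene
    d: primary samples without one
    """
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the same 2x2 table.

    Assumption: the abstract does not state which test was used;
    Fisher's exact test is shown here only as one common choice.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x variant-positive metastatic samples
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed one
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Hypothetical counts for a single gene, for illustration only
print(odds_ratio(8, 2, 1, 5))
print(fisher_exact_two_sided(8, 2, 1, 5))
```

When a test like this is repeated across hundreds of genes, the resulting p-values would normally be adjusted for multiple comparisons before being interpreted.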
Across 756 genes, there were significant differences in all cancer types. In breast cancer, variants were over-represented in metastatic samples (odds ratio, 5.91; < 10). mutations were over-represented in metastatic samples across all cancers. VICC had a significantly different cancer type distribution than GENIE, but patients were well matched with respect to age, sex, and sample type. Treatment data from VICC were used for a bipartite network analysis, which revealed some clusters with a mix of histologies and others that were more histology specific.
This article demonstrates the feasibility of deriving meaningful insights from GENIE at the inter- and intra-institutional level and illuminates the opportunities and challenges of the data GENIE contains. The results should help guide future development of GENIE, with the goal of fully realizing its potential for accelerating precision medicine.