Department of Computer Science, Bioinformatics Facility of Xavier NIH RCMI Cancer Research Center, Xavier University of Louisiana, New Orleans, LA, USA.
Department of Pathology, Tulane School of Medicine, Tulane Cancer Center, Tulane University, New Orleans, LA, USA.
Bioinformatics. 2018 Jul 1;34(13):i404-i411. doi: 10.1093/bioinformatics/bty232.
Somatic mutations in proto-oncogenes and tumor suppressor genes constitute a major category of causal genetic abnormalities in tumor cells. The mutation spectra of thousands of tumors have been generated by The Cancer Genome Atlas (TCGA) and other whole genome (exome) sequencing projects. A promising approach to utilizing these resources for precision medicine is to identify genetic similarity-based sub-types within a cancer type and relate the pinpointed sub-types to the clinical outcomes and pathologic characteristics of patients.
We propose two novel methods, ccpwModel and xGeneModel, for mutation-based clustering of tumors. In ccpwModel, the clustering features are binary variables that indicate the mutation status of cancer driver genes in each tumor and the involvement of those genes in the core cancer pathways. In xGeneModel, the functional similarities of putative cancer driver genes and their confidence scores as 'true' driver genes are integrated with the mutation spectra to calculate the genetic distances between tumors. We apply both methods to the TCGA data of 16 cancer types. Compared with state-of-the-art approaches, both methods yield promising results with respect to the associations between the identified tumor clusters and patient race (or survival time). We further extend the analysis to detect mutation-characterized transcriptomic prognostic signatures, which are directly relevant to the etiology of carcinogenesis.
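As a rough illustration of the feature construction described for ccpwModel, the following Python sketch builds binary gene-status and pathway-involvement features for a few tumors and groups them. It is not the authors' R implementation: the gene list, pathway membership, and tumor profiles are hypothetical toy data, and the Jaccard distance and average-linkage merging are assumptions chosen for simplicity, since the abstract does not specify the distance or clustering algorithm.

```python
from itertools import combinations

# Hypothetical toy inputs for illustration only (not data from the paper).
DRIVER_GENES = ["TP53", "KRAS", "PIK3CA", "APC"]
PATHWAYS = {
    "cell_cycle": {"TP53"},
    "RTK_RAS": {"KRAS"},
    "PI3K": {"PIK3CA"},
    "WNT": {"APC"},
}
MUTATIONS = {
    "tumor_A": {"TP53", "KRAS"},
    "tumor_B": {"TP53", "KRAS", "APC"},
    "tumor_C": {"PIK3CA"},
    "tumor_D": {"PIK3CA", "APC"},
}


def binary_features(mutated):
    """One bit per driver gene (mutated or not), then one bit per pathway
    (at least one member gene mutated or not)."""
    gene_bits = [int(g in mutated) for g in DRIVER_GENES]
    pathway_bits = [int(bool(members & mutated)) for members in PATHWAYS.values()]
    return gene_bits + pathway_bits


def jaccard_distance(u, v):
    """Assumed distance: 1 - |shared ones| / |ones in either vector|."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return 0.0 if either == 0 else 1.0 - both / either


def average_linkage_clusters(features, k):
    """Assumed clustering: merge the two closest clusters until k remain."""
    clusters = [[name] for name in features]

    def dist(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(jaccard_distance(features[a], features[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return [sorted(c) for c in clusters]


FEATURES = {tumor: binary_features(m) for tumor, m in MUTATIONS.items()}
CLUSTERS = average_linkage_clusters(FEATURES, k=2)
print(CLUSTERS)
```

On this toy input, the two tumors sharing TP53 and KRAS mutations group together, as do the two sharing PIK3CA. The pathway bits let tumors with different mutated genes in the same pathway look more alike than gene-level bits alone would suggest.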
R code and example data for ccpwModel and xGeneModel can be obtained from http://webusers.xula.edu/kzhang/ISMB2018/ccpw_xGene_software.zip.
Supplementary data are available at Bioinformatics online.