Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada, M5S 1A8.
Computational Biology Program, Ontario Institute for Cancer Research, Toronto, ON, Canada, M5G 0A3.
Nat Commun. 2020 Feb 5;11(1):734. doi: 10.1038/s41467-019-13929-1.
The discovery of driver mutations is one of the key motivations for cancer genome sequencing. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole-genome sequencing data from 2658 cancers across 38 tumour types, we describe DriverPower, a software package that uses mutational burden and functional impact evidence to identify driver mutations in coding and non-coding sites within cancer whole genomes. Using a total of 1373 genomic features derived from public sources, DriverPower's background mutation model explains up to 93% of the regional variance in the mutation rate across multiple tumour types. Incorporating functional impact scores further increases the accuracy of driver discovery. Tested on a collection of 2583 cancer genomes from the PCAWG project, DriverPower identifies 217 coding and 95 non-coding driver candidates. Compared with six published methods used by the PCAWG Drivers and Functional Interpretation Working Group, DriverPower has the highest F1 score for both coding and non-coding driver discovery. These results demonstrate that DriverPower is an effective framework for computational driver discovery.
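The mutational burden idea in the abstract can be illustrated with a minimal sketch: given a background mutation rate predicted for a genomic element, ask how surprising the observed mutation count is. The code below is not DriverPower's actual model or API. It assumes a plain one-sided binomial tail test, omits the functional impact adjustment, and uses hypothetical names (`burden_p_value`, `n_sites`, `background_rate`) chosen only for illustration.

```python
import math


def log_binom_pmf(k, n, p):
    """Log of the Binomial(n, p) probability mass at k, computed in log space."""
    return (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )


def burden_p_value(observed, n_sites, background_rate):
    """One-sided p-value P(X >= observed) for X ~ Binomial(n_sites, background_rate).

    Assumptions (hypothetical, for illustration only):
      n_sites         = element length in bp * number of samples
      background_rate = predicted per-site, per-sample mutation probability (0 < rate < 1)
    """
    if observed <= 0:
        return 1.0
    total = 0.0
    # Sum the upper tail directly so that very small p-values keep their precision.
    for k in range(observed, n_sites + 1):
        term = math.exp(log_binom_pmf(k, n_sites, background_rate))
        total += term
        # Stop once further terms no longer change the sum.
        if term < total * 1e-16:
            break
    return min(total, 1.0)


if __name__ == "__main__":
    # Hypothetical element: 500 bp across 100 samples, predicted rate 2e-5 per site.
    print(burden_p_value(observed=8, n_sites=500 * 100, background_rate=2e-5))
```

A small p-value indicates more mutations than the background model predicts, which is the signal a burden-based method looks for before any functional evidence is considered.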