Leiserson Mark D M, Reyna Matthew A, Raphael Benjamin J
Department of Computer Science and Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA.
Bioinformatics. 2016 Sep 1;32(17):i736-i745. doi: 10.1093/bioinformatics/btw462.
The somatic mutations in the pathways that drive cancer development tend to be mutually exclusive across tumors, providing a signal for distinguishing driver mutations from a larger number of random passenger mutations. This mutual exclusivity signal can be confounded by high and highly variable mutation rates across a cohort of samples. Current statistical tests for exclusivity that incorporate both per-gene and per-sample mutational frequencies are computationally expensive and have limited precision.
We formulate a weighted exact test for assessing the significance of mutual exclusivity in an arbitrary number of mutational events. Our test conditions on the number of samples with a mutation as well as on per-event, per-sample mutation probabilities. We provide a recursive formula to compute P-values for the weighted test exactly, as well as a highly accurate and efficient saddlepoint approximation of the test. Our test approximates a commonly used permutation test for exclusivity that conditions on per-event, per-sample mutation frequencies, but it is more efficient and recovers more significant results than the permutation test. We use our Weighted Exclusivity Test (WExT) software to analyze hundreds of colorectal and endometrial cancer samples from The Cancer Genome Atlas; these two cancer types often have extremely high mutation rates. On both cancer types, the weighted test identifies sets of mutually exclusive mutations in cancer genes with fewer false positives than earlier approaches.
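To make the idea of a weighted exact test concrete, the following is a minimal sketch and not the WExT implementation. It assumes, beyond what the abstract states, that the test statistic is the number of samples in which exactly one of the events is mutated, that mutations are independent Bernoulli draws with the given per-event, per-sample probabilities, and that the test conditions on the number of mutated samples for each event. The function name `weighted_exclusivity_pvalue` and its arguments are hypothetical. The dynamic program enumerates all 2^k mutation patterns per sample, so it is only practical for small event sets.

```python
from collections import defaultdict
from itertools import product


def weighted_exclusivity_pvalue(probs, counts, t_obs):
    """Exact conditional tail probability by dynamic programming over samples.

    probs[i][j] : assumed probability that event i is mutated in sample j
    counts[i]   : observed number of samples mutated in event i
    t_obs       : observed number of samples with exactly one event mutated
    Returns P(T >= t_obs | per-event mutation counts equal `counts`).
    """
    k = len(probs)
    n = len(probs[0])
    counts = tuple(counts)

    # State: (exclusive-sample count, mutation count per event) -> probability.
    dist = {(0,) + (0,) * k: 1.0}
    for j in range(n):
        nxt = defaultdict(float)
        for pattern in product((0, 1), repeat=k):
            # Probability of this mutation pattern in sample j.
            w = 1.0
            for i, bit in enumerate(pattern):
                w *= probs[i][j] if bit else 1.0 - probs[i][j]
            if w == 0.0:
                continue
            exclusive = 1 if sum(pattern) == 1 else 0
            for (t, *ys), pr in dist.items():
                new_ys = tuple(y + b for y, b in zip(ys, pattern))
                # Prune states that already exceed the conditioned counts.
                if any(y > c for y, c in zip(new_ys, counts)):
                    continue
                nxt[(t + exclusive,) + new_ys] += pr * w
        dist = nxt

    # Keep only states matching the conditioned counts, then take the tail.
    denom = sum(pr for (t, *ys), pr in dist.items() if tuple(ys) == counts)
    numer = sum(pr for (t, *ys), pr in dist.items()
                if tuple(ys) == counts and t >= t_obs)
    if denom == 0.0:
        raise ValueError("observed counts have zero probability under probs")
    return numer / denom


if __name__ == "__main__":
    # Two events, four samples, uniform probabilities: conditioning on one
    # mutation per event leaves 16 equally likely placements, 12 of them
    # exclusive (two exclusive samples), 4 co-occurring.
    uniform = [[0.5] * 4, [0.5] * 4]
    print(weighted_exclusivity_pvalue(uniform, (1, 1), 2))
```

With uniform probabilities the weights cancel and the sketch reduces to an unweighted exact test. With non-uniform probabilities, exclusivity that avoids a sample where both events are likely to be mutated becomes more surprising and receives a smaller P-value.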
See http://compbio.cs.brown.edu/projects/wext for software.
Supplementary data are available at Bioinformatics online.