Hajkarim Morteza Chalabi, Upfal Eli, Vandin Fabio
1 Biotech Research and Innovation Centre, University of Copenhagen, Copenhagen, Denmark.
2 Department of Computer Science, Brown University, Providence, RI, USA.
Algorithms Mol Biol. 2019 Mar 30;14:10. doi: 10.1186/s13015-019-0146-7. eCollection 2019.
We study the problem of identifying differentially mutated subnetworks of a large gene-gene interaction network, that is, subnetworks that display a significant difference in mutation frequency in two sets of cancer samples. We formally define the associated computational problem and show that the problem is NP-hard.
We propose a novel and efficient algorithm, called DAMOKLE, to identify differentially mutated subnetworks given genome-wide mutation data for two sets of cancer samples. We prove that DAMOKLE identifies subnetworks with statistically significant difference in mutation frequency when the data comes from a reasonable generative model, provided enough samples are available.
We test DAMOKLE on simulated and real data, showing that DAMOKLE does indeed find subnetworks with significant differences in mutation frequency and that it provides novel insights into the molecular mechanisms of the disease not revealed by standard methods.
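The notion of a subnetwork with a "difference in mutation frequency in two sets of cancer samples" can be illustrated with a small sketch. It is not the DAMOKLE algorithm itself. It only scores one candidate subnetwork, under the assumption that a sample counts as mutated in the subnetwork when at least one of its genes is mutated, and that the score is the difference between the mutated fractions of the two sample sets. The function names (`coverage`, `differential_coverage`, `is_connected`) and the data layout are hypothetical and chosen for illustration only.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def coverage(subnetwork: Set[str],
             mutations: Dict[str, Set[str]],
             samples: Iterable[str]) -> float:
    """Fraction of samples with at least one mutated gene in the subnetwork
    (assumed definition of mutation frequency for this sketch)."""
    samples = list(samples)
    if not samples:
        return 0.0
    genes = set(subnetwork)
    hit = sum(1 for s in samples if mutations.get(s, set()) & genes)
    return hit / len(samples)


def differential_coverage(subnetwork: Set[str],
                          mutations: Dict[str, Set[str]],
                          class_c: Iterable[str],
                          class_d: Iterable[str]) -> float:
    """Coverage in sample set C minus coverage in sample set D."""
    return (coverage(subnetwork, mutations, class_c)
            - coverage(subnetwork, mutations, class_d))


def is_connected(subnetwork: Set[str],
                 edges: List[Tuple[str, str]]) -> bool:
    """Check that the genes form a connected subgraph of the interaction
    network, using breadth-first search restricted to the subnetwork."""
    genes = set(subnetwork)
    if not genes:
        return False
    # Build adjacency lists using only edges with both endpoints in the subnetwork.
    adj: Dict[str, Set[str]] = {g: set() for g in genes}
    for u, v in edges:
        if u in genes and v in genes:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(genes))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen == genes


# Toy data: sample -> set of mutated genes (entirely made up for illustration).
mutations = {"s1": {"A"}, "s2": {"B"}, "s3": set(), "s4": {"C"}}
class_c = ["s1", "s2"]
class_d = ["s3", "s4"]
edges = [("A", "B"), ("B", "C")]

score = differential_coverage({"A", "B"}, mutations, class_c, class_d)
print(score, is_connected({"A", "B"}, edges))
```

A search for subnetworks would evaluate many connected candidates and assess the significance of the best scores. Scoring a single candidate, as here, is cheap; the NP-hardness stated above concerns finding the best subnetwork over the whole network.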