Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian, China.
Department of Breast Surgery, Institute of Breast Disease, Second Hospital of Dalian Medical University, Dalian, China.
Sci Rep. 2019 Apr 11;9(1):5959. doi: 10.1038/s41598-019-42500-7.
High coverage and mutual exclusivity (HCME), two combinatorial properties of the mutations in a set of cancer driver genes, have been used to develop mathematical programming models for identifying cancer driver gene sets. In this paper, we summarize a weak HCME pattern that better fits the mutation datasets encountered in practice. We then present AWRMP, a method that identifies driver gene sets by adaptively assigning weights to candidate genes, tuning the balance between coverage and mutual exclusivity. AWRMP embeds a genetic algorithm in a subsampling strategy so that the optimization results are robust to uncertainty and noise in the data. Using biological datasets, we show that AWRMP identifies driver gene sets that satisfy the weak HCME pattern and outperforms state-of-the-art methods in terms of robustness.
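To make the coverage versus mutual-exclusivity trade-off concrete, the following minimal sketch scores a candidate gene set with the maximum-weight-submatrix style function used in earlier HCME work, W(M) = |Γ(M)| − α·ω(M), where Γ(M) is the set of samples mutated in at least one gene of M and ω(M) is the coverage overlap. The abstract does not give AWRMP's actual weight function, so this is only an illustration: the function names and the single trade-off parameter `alpha` are assumptions, and `alpha` is not the per-gene adaptive weighting that AWRMP uses.

```python
from typing import Dict, Iterable, Set

# Hypothetical input format: gene name -> set of sample IDs carrying a mutation.
Mutations = Dict[str, Set[int]]


def coverage(mutations: Mutations, genes: Iterable[str]) -> Set[int]:
    """Samples mutated in at least one gene of the candidate set."""
    covered: Set[int] = set()
    for gene in genes:
        covered |= mutations.get(gene, set())
    return covered


def coverage_overlap(mutations: Mutations, genes: Iterable[str]) -> int:
    """Extra mutations beyond one per covered sample (0 means perfect exclusivity)."""
    genes = list(genes)
    total = sum(len(mutations.get(gene, set())) for gene in genes)
    return total - len(coverage(mutations, genes))


def weight(mutations: Mutations, genes: Iterable[str], alpha: float = 1.0) -> float:
    """Coverage minus alpha times overlap; alpha is an assumed trade-off knob."""
    genes = list(genes)
    return len(coverage(mutations, genes)) - alpha * coverage_overlap(mutations, genes)


if __name__ == "__main__":
    # Toy data: genes A and B share sample 3, so the set is not fully exclusive.
    toy = {"A": {1, 2, 3}, "B": {3, 4}, "C": {5}}
    print(weight(toy, ["A", "B", "C"]))             # penalizes the shared sample fully
    print(weight(toy, ["A", "B", "C"], alpha=0.5))  # tolerates more overlap
```

With `alpha = 1` the score reduces to the familiar 2|Γ(M)| − Σ|Γ(g)| form; lowering `alpha` tolerates more overlap, which is one simple way to picture a weaker exclusivity requirement.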