Bokhari Yahya, Arodz Tomasz
Department of Computer Science, School of Engineering, Virginia Commonwealth University, 401 W. Main St., Richmond, 23284, VA, USA.
Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, 23284, VA, USA.
BMC Bioinformatics. 2017 Oct 24;18(1):458. doi: 10.1186/s12859-017-1869-4.
Somatic mutations accumulate in human cells throughout life. Some have no adverse consequences, but others may lead to cancer. A cancer genome is typically unstable, so further mutations can accumulate in the DNA of cancer cells. An ongoing problem is to determine which mutations are drivers, that is, play a role in oncogenesis, and which are passengers that play no such role. One way to address this question is to inspect somatic mutations in the DNA of cancer samples from a cohort of patients and to detect patterns that differentiate driver from passenger mutations.
We propose QuaDMutEx, a method that incorporates three novel elements: a new gene set penalty that includes non-linear penalization of multiple mutations in putative sets of driver genes; the ability to adjust the method to handle slow- and fast-evolving tumors; and a computationally efficient procedure for finding gene sets that minimize the penalty, which combines heuristic Monte Carlo optimization with exact binary quadratic programming. Compared to existing methods, the proposed algorithm finds sets of putative driver genes that show higher coverage and lower excess coverage in eight sets of cancer samples from brain, ovarian, lung, and breast tumors.
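The abstract does not define coverage or excess coverage, so the following is only a minimal sketch under an assumed, commonly used convention: coverage is the fraction of patients with at least one mutation in the gene set, and excess coverage is the number of mutated (gene, patient) pairs beyond one per covered patient. The function name `coverage_stats`, the gene labels, and the toy data are hypothetical and are not taken from the QuaDMutEx code.

```python
from typing import Dict, List, Set, Tuple


def coverage_stats(
    mutations: Dict[str, Set[str]],
    gene_set: List[str],
    n_patients: int,
) -> Tuple[float, int]:
    """Return (coverage, excess) for a candidate gene set.

    mutations maps a gene name to the set of patient IDs carrying a
    somatic mutation in that gene (hypothetical input layout).
    Assumed definitions, not confirmed by the abstract:
      coverage = covered patients / all patients
      excess   = mutated (gene, patient) pairs - covered patients
    """
    covered: Set[str] = set()
    total_hits = 0
    for gene in gene_set:
        patients = mutations.get(gene, set())
        covered |= patients          # patients hit by at least one gene
        total_hits += len(patients)  # counts every (gene, patient) pair
    coverage = len(covered) / n_patients
    excess = total_hits - len(covered)
    return coverage, excess


# Toy cohort of 5 patients (p5 has no mutation in any listed gene).
toy = {
    "GENE_A": {"p1", "p2"},
    "GENE_B": {"p3"},
    "GENE_C": {"p2", "p3", "p4"},
}

exclusive = coverage_stats(toy, ["GENE_A", "GENE_B"], n_patients=5)
overlapping = coverage_stats(toy, ["GENE_A", "GENE_C"], n_patients=5)
print(exclusive)    # (0.6, 0)
print(overlapping)  # (0.8, 1)
```

In this toy cohort the set of GENE_A and GENE_B is mutually exclusive (no excess), while swapping in GENE_C raises coverage at the cost of one patient mutated in both genes. A driver set search of the kind described above would trade off these two quantities.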
QuaDMutEx improves on both coverage and excess coverage across different types of cancer, which shows that it should be part of a state-of-the-art toolbox in the driver gene discovery pipeline. It can detect genes harboring rare driver mutations that may be missed by existing methods. QuaDMutEx is available for download from https://github.com/bokhariy/QuaDMutEx under the GNU GPLv3 license.