Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain.
Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Spain.
Bioinformatics. 2019 Nov 1;35(22):4788-4790. doi: 10.1093/bioinformatics/btz501.
Identification of the genomic alterations driving tumorigenesis is one of the main goals in oncogenomics research. Given the evolutionary principles of cancer development, computational methods that detect signals of positive selection in the pattern of tumor mutations have been effectively applied in the search for cancer genes. One of these signals is the abnormal clustering of mutations, which has been shown to be complementary to other signals in the detection of driver genes.
We have developed OncodriveCLUSTL, a new sequence-based clustering algorithm to detect significant clustering signals across genomic regions. OncodriveCLUSTL is based on a local background model derived from simulated mutations that account for the composition of tri- or penta-nucleotide context substitutions observed in the cohort under study. Our method can identify known clusters and bona fide cancer drivers across cohorts of tumor whole-exomes, outperforming the existing OncodriveCLUST algorithm and complementing other methods based on different signals of positive selection. Our results indicate that OncodriveCLUSTL can be applied to the analysis of non-coding genomic elements and non-human mutation data.
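The idea of a context-aware local background can be illustrated with a minimal sketch. This is not the OncodriveCLUSTL implementation: the function names, the toy clustering score (the largest number of mutations falling within a short window) and the per-trinucleotide probabilities passed in as `context_probs` are all assumptions made for illustration. The sketch redistributes the observed number of mutations across a region in proportion to the mutability of each position's trinucleotide context, and reports how often the simulated score reaches the observed one.

```python
import random


def context_weights(sequence, context_probs):
    """Weight each position by the (assumed) mutation probability of its trinucleotide."""
    weights = []
    for i in range(len(sequence)):
        if i == 0 or i == len(sequence) - 1:
            weights.append(0.0)  # edge positions lack a full trinucleotide context
        else:
            weights.append(context_probs.get(sequence[i - 1:i + 2], 0.0))
    return weights


def max_window_count(positions, window):
    """Toy clustering score: most mutations found within any span of `window` bases."""
    pos = sorted(positions)
    best, start = 0, 0
    for end in range(len(pos)):
        # Shrink the span from the left until it is narrower than `window`.
        while pos[end] - pos[start] >= window:
            start += 1
        best = max(best, end - start + 1)
    return best


def empirical_p_value(sequence, observed, context_probs, window=5, n_sim=1000, seed=0):
    """Fraction of context-weighted simulations scoring at least as high as the observed data."""
    rng = random.Random(seed)
    weights = context_weights(sequence, context_probs)
    observed_score = max_window_count(observed, window)
    indices = range(len(sequence))
    hits = 0
    for _ in range(n_sim):
        # Place the same number of mutations, sampling positions by context weight.
        simulated = rng.choices(indices, weights=weights, k=len(observed))
        if max_window_count(simulated, window) >= observed_score:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_sim + 1)
```

Because the simulated mutations are confined to the same region and weighted by its own sequence composition, a region rich in highly mutable contexts is not flagged merely for accumulating mutations; only a concentration beyond what the local background produces yields a small empirical p-value.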
OncodriveCLUSTL is available as an installable Python 3.5 package. The source code and running examples are freely available at https://bitbucket.org/bbglab/oncodriveclustl under the GNU Affero General Public License.
Supplementary data are available at Bioinformatics online.