Department of Biomedical Engineering and Sciences, TECNUN, University of Navarra, San Sebastian, Spain.
Bioinformatics. 2022 Jan 12;38(3):844-845. doi: 10.1093/bioinformatics/btab709.
Discover is an algorithm developed to identify mutually exclusive genomic events. Its main contribution is a statistical analysis based on the Poisson-Binomial (PB) distribution that takes into account the mutation rates of genes and samples. Discover is very effective at identifying mutually exclusive mutations, but at the expense of speed on large datasets: the PB distribution is computationally costly to estimate, and checking all potential mutually exclusive alterations requires millions of tests.
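To illustrate why the PB distribution is costly, the following minimal Python sketch computes its exact probability mass function by dynamic programming and uses it for a one-sided mutual exclusivity test. It assumes, as an illustration rather than a description of the Discover code, that each sample has its own mutation probability per gene and that the two genes are independent under the null hypothesis. The function names and inputs are hypothetical.

```python
def poisson_binomial_pmf(probs):
    """Exact PMF of the number of successes among independent
    Bernoulli trials with success probabilities `probs`.

    Runs in O(n^2) time for n trials, which is what makes the
    exact computation expensive when repeated over many gene pairs.
    """
    pmf = [1.0]  # with zero trials, P(X = 0) = 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this trial fails
            nxt[k + 1] += mass * p       # this trial succeeds
        pmf = nxt
    return pmf


def mutual_exclusivity_pvalue(p_gene_a, p_gene_b, observed_cooccurrences):
    """P(X <= observed), where X counts samples mutated in both genes.

    Assumption (illustrative): under the null, the two genes are
    independent, so sample k is co-mutated with probability
    p_gene_a[k] * p_gene_b[k].
    """
    co_probs = [pa * pb for pa, pb in zip(p_gene_a, p_gene_b)]
    pmf = poisson_binomial_pmf(co_probs)
    return sum(pmf[: observed_cooccurrences + 1])
```

A small p-value means fewer co-occurrences were observed than independence would predict, which is the signature of mutual exclusivity.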
We have implemented a new version of the package, called Rediscover, that provides exact and approximate computations of the PB distribution. Rediscover's exact implementation is slightly faster than Discover for large and medium-sized datasets. The approximation is 100-1000 times faster on such datasets, making it possible to obtain results in less than a minute on a standard desktop. Rediscover also has a smaller memory footprint. The new package is available at CRAN and provides functions to integrate it with other R packages such as maftools and TCGAbiolinks.
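As a rough illustration of why an approximation can be so much faster, the sketch below evaluates the PB cumulative distribution with a normal approximation and continuity correction, which needs only the mean and variance of the trials. This is a stand-in chosen for simplicity: the abstract does not state which approximation Rediscover uses, and the function name is hypothetical.

```python
import math


def poisson_binomial_cdf_normal(probs, k):
    """Approximate P(X <= k) for a Poisson-Binomial variable X.

    Uses a normal approximation with continuity correction. This is
    an illustrative choice, not necessarily the method in Rediscover.
    Cost is O(n) per evaluation instead of O(n^2) for the exact PMF.
    """
    mean = sum(probs)
    variance = sum(p * (1.0 - p) for p in probs)
    if variance == 0.0:
        # Degenerate case: every trial is deterministic.
        return 1.0 if k >= mean else 0.0
    z = (k + 0.5 - mean) / math.sqrt(variance)
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

A normal approximation of this kind tends to lose accuracy in the far tails, where the smallest p-values lie, so it should be read as a demonstration of the speed trade-off rather than a recommended test.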
Rediscover is available at CRAN (https://cran.r-project.org/web/packages/Rediscover/index.html).
Supplementary data are available at Bioinformatics online.