Department of Biology, Center for Computational and Integrative Biology, Rutgers University, 315 Penn St, Camden, NJ 08102, USA.
Gigascience. 2017 Oct 1;6(10):1-7. doi: 10.1093/gigascience/gix091.
Current human whole genome sequencing (WGS) projects produce massive amounts of data, often creating significant computational challenges. Different approaches have been developed for each type of genome variant and each detection method, requiring users to run multiple algorithms to find variants. We present Genome Rearrangement OmniMapper (GROM), a novel comprehensive variant detection algorithm that accepts aligned read files as input and finds single-nucleotide variants (SNVs), indels, structural variants (SVs), and copy number variants (CNVs). We show that GROM outperforms state-of-the-art methods on 7 validated benchmarks using 2 WGS data sets. Additionally, GROM has very fast run times: it analyzes a 50× WGS human data set (NA12878) on commonly available computer hardware in 11 minutes, more than an order of magnitude (up to 72 times) faster than tools detecting a similar range of variants. Addressing the needs of big data analysis, GROM combines SNV, indel, SV, and CNV detection in 1 algorithm, providing superior speed, sensitivity, and precision. GROM can also detect CNVs, SNVs, and indels in non-paired-read WGS libraries, as well as SNVs and indels in whole exome or RNA sequencing data sets.