BGI Genomics, Shenzhen, 518083, China.
Clin Lab, BGI Genomics, Shenzhen, 518083, China.
Hum Genomics. 2024 Oct 10;18(1):114. doi: 10.1186/s40246-024-00666-w.
Whole genome sequencing (WGS) is becoming increasingly prevalent for molecular diagnosis, staging and prognosis because of its declining cost and its ability to detect nearly all genes associated with a patient's disease. The currently most widely accepted variant calling pipeline, GATK, is limited in computational speed and efficiency and cannot keep up with growing analysis needs.
Here, we propose a fast and accurate DNASeq variant calling workflow composed entirely of tools from the LUSH toolkit. Precision and recall measurements indicate that the LUSH and GATK pipelines produce highly consistent results, with both precision and recall exceeding 99% on the 30x NA12878 dataset. In terms of processing speed, the LUSH pipeline outperforms the GATK pipeline, completing 30x WGS data analysis in just 1.6 h, approximately 17 times faster than GATK. Notably, the LUSH_HC tool completes the processing from BAM to VCF in just 12 min, around 76 times faster than GATK.
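For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how precision and recall are conventionally computed from true-positive, false-positive and false-negative variant counts, and of how a speedup factor relates two runtimes. The function names and the counts are hypothetical and chosen only for illustration; they are not taken from the study. The baseline runtimes it derives are simple arithmetic on the rounded figures quoted above, not values reported in the abstract.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) for a call set compared against a truth set.

    tp: calls that match the truth set
    fp: calls that are absent from the truth set
    fn: truth-set variants that were not called
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


def implied_baseline_hours(fast_hours: float, speedup: float) -> float:
    """Runtime of the slower pipeline implied by a 'speedup times faster' claim."""
    return fast_hours * speedup


# Hypothetical counts, for illustration only (not from the paper).
precision, recall = precision_recall(tp=9950, fp=50, fn=50)

# Baselines implied by the rounded figures in the abstract (an inference).
whole_pipeline_baseline = implied_baseline_hours(1.6, 17)       # about 27 h
bam_to_vcf_baseline = implied_baseline_hours(12 / 60, 76)       # about 15 h
```

Under these assumptions, a pipeline with 9,950 matching calls, 50 spurious calls and 50 missed variants would have precision and recall of 99.5% each. Because the quoted speedups are approximate, the implied baseline runtimes are rough estimates only.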
These findings suggest that the LUSH pipeline is a highly promising alternative to the GATK pipeline for WGS data analysis, with the potential to significantly improve bedside analysis of acutely ill patients, large-scale cohort data analysis, and high-throughput variant calling in crop breeding programs. Furthermore, the LUSH pipeline is highly scalable and easily deployable, allowing it to be readily applied to various scenarios such as clinical diagnosis and genomic research.