Fast and accurate DNASeq variant calling workflow composed of LUSH toolkit.

Affiliations

BGI Genomics, Shenzhen, 518083, China.

Clin Lab, BGI Genomics, Shenzhen, 518083, China.

Publication information

Hum Genomics. 2024 Oct 10;18(1):114. doi: 10.1186/s40246-024-00666-w.

Abstract

BACKGROUND

Whole genome sequencing (WGS) is becoming increasingly prevalent for molecular diagnosis, staging and prognosis because of its declining cost and its ability to detect nearly all genes associated with a patient's disease. GATK, the most widely accepted variant calling pipeline, is limited in computational speed and efficiency and cannot keep up with growing analysis demand.

RESULTS

Here, we propose a fast and accurate DNASeq variant calling workflow composed entirely of tools from the LUSH toolkit. Precision and recall measurements show that the LUSH and GATK pipelines are highly consistent with each other, and both exceed 99% precision and recall on the 30x NA12878 dataset. The LUSH pipeline is also faster: it completes 30x WGS data analysis in 1.6 h, approximately 17 times faster than GATK. Notably, the LUSH_HC tool processes BAM to VCF in just 12 min, around 76 times faster than GATK.
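To make the reported metrics concrete, the short Python sketch below shows how precision and recall are conventionally computed from variant-comparison counts, and what GATK runtimes the stated speedups would imply. It is an illustration only: the function names and the true-positive, false-positive and false-negative counts are hypothetical and not taken from the paper, and it assumes "N times faster" means a ratio of wall-clock runtimes.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from variant-comparison counts.

    precision = TP / (TP + FP), recall = TP / (TP + FN)
    """
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("need at least one called variant and one truth variant")
    return tp / (tp + fp), tp / (tp + fn)


def implied_baseline_hours(fast_hours: float, speedup: float) -> float:
    """Baseline runtime implied by a speedup, assuming speedup = baseline / fast."""
    return fast_hours * speedup


# Hypothetical counts, chosen only to illustrate values above 99%.
precision, recall = precision_recall(tp=9950, fp=30, fn=50)

# Runtimes implied by the abstract's figures (assumption: runtime ratio).
pipeline_gatk_hours = implied_baseline_hours(1.6, 17)    # whole pipeline
hc_gatk_hours = implied_baseline_hours(12 / 60, 76)      # BAM to VCF step

print(f"precision={precision:.4f} recall={recall:.4f}")
print(f"implied GATK pipeline runtime: about {pipeline_gatk_hours:.1f} h")
print(f"implied GATK BAM-to-VCF runtime: about {hc_gatk_hours:.1f} h")
```

Under that assumption, the figures in the abstract would correspond to roughly 27 h for the full GATK pipeline and roughly 15 h for the GATK BAM-to-VCF step. The abstract itself does not state these baseline runtimes.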

CONCLUSION

These findings suggest that the LUSH pipeline is a highly promising alternative to the GATK pipeline for WGS data analysis, with the potential to significantly improve bedside analysis of acutely ill patients, large-scale cohort data analysis, and high-throughput variant calling in crop breeding programs. Furthermore, the LUSH pipeline is highly scalable and easily deployable, allowing it to be readily applied to various scenarios such as clinical diagnosis and genomic research.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd73/11465951/be160c980e4d/40246_2024_666_Fig1_HTML.jpg
