Behera Sairam, Catreux Severine, Rossi Massimiliano, Truong Sean, Huang Zhuoyi, Ruehle Michael, Visvanath Arun, Parnaby Gavin, Roddey Cooper, Onuchic Vitor, Finocchio Andrea, Cameron Daniel L, English Adam, Mehtalia Shyamal, Han James, Mehio Rami, Sedlazeck Fritz J
Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
Illumina, Inc., San Diego, CA, USA.
Nat Biotechnol. 2024 Oct 25. doi: 10.1038/s41587-024-02382-1.
Research and medical genomics require comprehensive, scalable methods for the discovery of novel disease targets, evolutionary drivers and genetic markers with clinical significance. This necessitates a framework to identify all types of variants independent of their size or location. Here we present DRAGEN, which uses multigenome mapping with pangenome references, hardware acceleration and machine learning-based variant detection to provide insights into individual genomes, with ~30 min of computation time from raw reads to variant detection. DRAGEN outperforms current state-of-the-art methods in speed and accuracy across all variant types (single-nucleotide variations, insertions or deletions, short tandem repeats, structural variations and copy number variations) and incorporates specialized methods for analysis of medically relevant genes. We demonstrate the performance of DRAGEN across 3,202 whole-genome sequencing datasets by generating fully genotyped multisample variant call format files and demonstrate its scalability, accuracy and innovation to further advance the integration of comprehensive genomics. Overall, DRAGEN marks a major milestone in sequencing data analysis and will provide insights across various diseases, including Mendelian and rare diseases, with a highly comprehensive and scalable platform.