Behera Sairam, Catreux Severine, Rossi Massimiliano, Truong Sean, Huang Zhuoyi, Ruehle Michael, Visvanath Arun, Parnaby Gavin, Roddey Cooper, Onuchic Vitor, Cameron Daniel L, English Adam, Mehtalia Shyamal, Han James, Mehio Rami, Sedlazeck Fritz J
Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
Illumina Inc., San Diego, CA, USA.
bioRxiv. 2024 Jan 6:2024.01.02.573821. doi: 10.1101/2024.01.02.573821.
Research and medical genomics require comprehensive and scalable solutions to drive the discovery of novel disease targets, evolutionary drivers, and genetic markers with clinical significance. This necessitates a framework that identifies all types of variants independent of their size (e.g., SNV/SV) or location (e.g., repeats). Here we present DRAGEN, which uses novel methods based on multigenomes, hardware acceleration, and machine learning-based variant detection to provide novel insights into individual genomes in ~30 min of computation time (from raw reads to variant detection). DRAGEN outperforms all other state-of-the-art methods in speed and accuracy across all variant types (SNV, indel, STR, SV, CNV) and further incorporates specialized methods to obtain key insights in medically relevant genes (e.g., HLA, SMN, GBA). We showcase DRAGEN across 3,202 genomes and demonstrate its scalability, accuracy, and innovations to further advance the integration of comprehensive genomics into research and medical applications.