Department of Computer Science, Rice University, 6100 Main Street, Houston, TX, 77005, USA.
Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA.
Nat Commun. 2022 Mar 14;13(1):1321. doi: 10.1038/s41467-022-28852-1.
Infectious disease monitoring on Oxford Nanopore Technologies (ONT) platforms offers rapid turnaround times and low cost. Tracking low-frequency intra-host variants provides important insights into within-host viral population dynamics and transmission. However, given the higher error rate of ONT, accurate identification of intra-host variants with low allele frequencies remains an open challenge with no viable computational solutions available. In response to this need, we present Variabel, a novel approach and the first method designed for rescuing low-frequency intra-host variants from ONT data alone. We evaluate Variabel on both synthetic data (SARS-CoV-2) and patient-derived datasets (Ebola virus, norovirus, SARS-CoV-2); our results show that Variabel can accurately identify low-frequency variants below 0.5 allele frequency, outperforming existing state-of-the-art ONT variant callers for this task. Variabel is open-source and available for download at: www.gitlab.com/treangenlab/variabel.