Fritz A, Bremges A, Deng Z-L, Lesker T-R, Götting J, Ganzenmüller T, Sczyrba A, Dilthey A, Klawonn F, McHardy A C
BIFO, Department of Computational Biology, Helmholtz Centre for Infection Research, Braunschweig, Germany.
DZIF, German Centre for Infection Research.
bioRxiv. 2021 Jan 26:2021.01.25.428049. doi: 10.1101/2021.01.25.428049.
Viral infections often involve multiple related viral strains, arising from coinfection or within-host evolution. We describe Haploflow, a de Bruijn graph-based assembler that reconstructs the genomes of viral strains from mixed sequence samples using a novel flow algorithm. We assessed Haploflow across multiple benchmark data sets of increasing complexity, showing that it is faster and more accurate than viral haplotype assemblers and than generic metagenome assemblers, which do not aim to reconstruct strains. Haploflow reconstructed high-quality strain-resolved assemblies from clinical HCMV samples, and SARS-CoV-2 genomes from wastewater metagenomes that were identical to genomes from clinical isolates.