Life Science & Health, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands.
Genome Data Science, Faculty of Technology, Bielefeld University, Bielefeld, Germany.
Genome Biol. 2021 Oct 27;22(1):299. doi: 10.1186/s13059-021-02512-x.
Haplotype-aware diploid genome assembly is crucial in genomics, precision medicine, and many other disciplines. Long-read sequencing technologies have greatly improved genome assembly. However, current long-read assemblers are either reference-based, and therefore introduce biases, or fail to capture the haplotype diversity of diploid genomes. We present phasebook, a de novo approach for reconstructing the haplotypes of diploid genomes from long reads. phasebook outperforms other approaches in terms of haplotype coverage by large margins, while achieving competitive performance in terms of assembly errors and assembly contiguity.