Department of Viticulture and Enology, University of California Davis, Davis, CA 95616, USA.
G3 (Bethesda). 2022 Jul 29;12(8). doi: 10.1093/g3journal/jkac143.
De novo genome assembly is essential for genomic research. High-quality genomes assembled into phased pseudomolecules are challenging to produce and often contain assembly errors because of repeats, heterozygosity, or the chosen assembly strategy. Although algorithms that produce partially phased assemblies exist, haploid draft assemblies that may lack biological information remain favored because they are easier to generate and use. We developed HaploSync, a suite of tools that produces fully phased, chromosome-scale diploid genome assemblies and performs extensive quality control to limit assembly artifacts. HaploSync scaffolds sequences from a draft diploid assembly into phased pseudomolecules guided by a genetic map and/or the genome of a closely related species. HaploSync generates a report that visualizes the relationships between current and legacy sequences for both haplotypes and displays their gene and marker content. This quality control helps the user identify misassemblies and guides HaploSync's correction of scaffolding errors. Finally, HaploSync fills assembly gaps with unplaced sequences and resolves collapsed homozygous regions. In a series of case studies spanning the plant, fungal, and animal kingdoms, we demonstrate that HaploSync efficiently increases the assembly contiguity of phased chromosomes, improves completeness by filling gaps, corrects scaffolding, and correctly phases highly heterozygous, complex regions.
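To make the idea of map-guided scaffolding concrete, the following minimal Python sketch orders contigs along chromosomes using genetic-map marker positions. It is an illustration under stated assumptions, not HaploSync's actual algorithm or interface: the function name `order_contigs_by_map`, the `(contig, chromosome, position)` input tuples, and the toy marker data are all hypothetical, and the sketch ignores orientation, phasing into two haplotypes, and conflict resolution.

```python
from collections import defaultdict
from statistics import median


def order_contigs_by_map(markers):
    """Order contigs along chromosomes from genetic-map marker hits.

    markers: iterable of (contig, chromosome, map_position_cM) tuples
             (hypothetical input format, chosen for illustration).
    Returns {chromosome: [contig, ...]} sorted by median marker position.
    """
    # Collect the map positions of every marker, per contig and chromosome.
    hits = defaultdict(lambda: defaultdict(list))
    for contig, chrom, pos in markers:
        hits[contig][chrom].append(pos)

    placed = defaultdict(list)
    for contig, by_chrom in hits.items():
        # Assign the contig to the chromosome supported by the most markers.
        chrom = max(by_chrom, key=lambda c: len(by_chrom[c]))
        # Use the median position so a single stray marker has little effect.
        placed[chrom].append((median(by_chrom[chrom]), contig))

    # Sort contigs within each chromosome by their median map position.
    return {chrom: [c for _, c in sorted(items)] for chrom, items in placed.items()}


# Toy data (invented): ctgB has one conflicting marker on chr2.
markers = [
    ("ctgA", "chr1", 12.0), ("ctgA", "chr1", 15.5),
    ("ctgB", "chr1", 2.1), ("ctgB", "chr1", 4.0), ("ctgB", "chr2", 30.0),
    ("ctgC", "chr2", 8.0),
]
layout = order_contigs_by_map(markers)
print(layout)
```

In this toy case the conflicting chr2 marker on `ctgB` is outvoted by its two chr1 markers. In a real assembly, such a conflict is the kind of signal a quality-control report would surface as a possible misassembly.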