Department of Human Cell Biology and Genetics, Joint Laboratory of Guangdong-Hong Kong Universities for Vascular Homeostasis and Diseases, School of Medicine, Southern University of Science and Technology, Shenzhen, China.
National Key Laboratory for Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China.
Nat Plants. 2024 Aug;10(8):1184-1200. doi: 10.1038/s41477-024-01755-3. Epub 2024 Aug 5.
Scaffolding is crucial for constructing most chromosome-level genomes. The high-throughput chromatin conformation capture (Hi-C) technology has become the primary scaffolding strategy due to its convenience and cost-effectiveness. As sequencing technologies and assembly algorithms advance, constructing haplotype-resolved genomes is increasingly preferred because haplotypes can provide additional genetic information on allelic and non-allelic variations. ALLHiC is a widely used allele-aware scaffolding tool designed for this purpose. However, its dependence on chromosome-level reference genomes and its relatively high chromosome misassignment rate still impede the unravelling of haplotype-resolved genomes. Here we present HapHiC, a reference-independent allele-aware scaffolding tool with superior performance in chromosome assignment as well as contig ordering and orientation. In addition, we provide new insights into the challenges in allele-aware scaffolding by conducting comprehensive analyses of various adverse factors. Finally, with the help of HapHiC, we constructed the haplotype-resolved allotriploid genome for Miscanthus × giganteus, an important lignocellulosic bioenergy crop.