The Davies Research Centre, School of Animal and Veterinary Sciences, University of Adelaide, Roseworthy, SA, 5371, Australia.
Cell Wall Biology and Utilization Laboratory, ARS USDA, Madison, WI, 53706, USA.
Nat Commun. 2019 Jan 16;10(1):260. doi: 10.1038/s41467-018-08260-0.
Rapid innovation in sequencing technologies and improvement in assembly algorithms have enabled the creation of highly contiguous mammalian genomes. Here we report a chromosome-level assembly of the water buffalo (Bubalus bubalis) genome using single-molecule sequencing and chromatin conformation capture data. PacBio Sequel reads, with a mean length of 11.5 kb, helped to resolve repetitive elements and generate sequence contiguity. All five B. bubalis sub-metacentric chromosomes were correctly scaffolded with centromeres spanned. Although the index animal was partly inbred, 58% of the genome was haplotype-phased by FALCON-Unzip. This new reference genome improves the contig N50 of the previous short-read-based buffalo assembly more than a thousand-fold and contains only 383 gaps. It surpasses the human and goat references in sequence contiguity and facilitates the annotation of hard-to-assemble gene clusters such as the major histocompatibility complex (MHC).
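For readers unfamiliar with the contiguity metric cited above, the following is a minimal sketch of how a contig N50 is commonly computed: the contig length at which the cumulative sum of lengths, sorted from longest to shortest, first reaches half of the total assembly size. The function name `n50` and the toy length lists are illustrative assumptions and are not data from this study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)   # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assemblies with the same total size (32 units).
toy_fragmented = [2, 2, 2, 3, 3, 4, 8, 8]   # many short contigs
toy_contiguous = [12, 20]                   # few long contigs

print(n50(toy_fragmented))
print(n50(toy_contiguous))
```

Because both toy assemblies have the same total size, the difference in N50 reflects only how the bases are distributed across contigs, which is the sense in which a long-read assembly can improve N50 without changing genome size.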