Computer Science Division, University of California Berkeley, 2626 Hearst Avenue, Berkeley, CA 94720, USA.
Cardiovascular Research Institute, University of California San Francisco, 555 Mission Bay Boulevard South, San Francisco, CA 94158, USA.
Gigascience. 2020 Dec 7;9(12). doi: 10.1093/gigascience/giaa134.
Baboons are a widely used nonhuman primate model for biomedical, evolutionary, and basic genetics research. Despite this importance, the genomic resources for baboons are limited. In particular, the current baboon reference genome Panu_3.0 is a highly fragmented, reference-guided (i.e., not fully de novo) assembly, and its poor quality inhibits our ability to conduct downstream genomic analyses.
Here we present a de novo genome assembly of the olive baboon (Papio anubis) that uses data from several recently developed single-molecule technologies. Our assembly, Panubis1.0, has an N50 contig size of ∼1.46 Mb (as opposed to 139 kb for Panu_3.0) and has single scaffolds that span each of the 20 autosomes and the X chromosome.
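For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: sort the contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the total assembly length. The function name `n50` and the toy contig lengths are illustrative assumptions, not data or code from the Panubis1.0 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    cover at least half of the summed length of all contigs.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig that brings us to half the total.
        if 2 * running >= total:
            return length


# Hypothetical toy lengths (not from the paper): total = 300, half = 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 = 150 reaches half, so N50 is 70
```

A higher N50 means that more of the assembly sits in long contiguous pieces, which is why the roughly tenfold increase over Panu_3.0 is reported as an improvement in contiguity.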
We highlight multiple lines of evidence (including Bionano Genomics data, pedigree linkage information, and linkage disequilibrium data) suggesting that there are several large assembly errors in Panu_3.0, which have been corrected in Panubis1.0.