High-coverage sequencing and annotated assembly of the genome of the Australian dragon lizard Pogona vitticeps.
Authors
Georges Arthur, Li Qiye, Lian Jinmin, O'Meally Denis, Deakin Janine, Wang Zongji, Zhang Pei, Fujita Matthew, Patel Hardip R, Holleley Clare E, Zhou Yang, Zhang Xiuwen, Matsubara Kazumi, Waters Paul, Graves Jennifer A Marshall, Sarre Stephen D, Zhang Guojie
Affiliations
Institute for Applied Ecology, University of Canberra, Canberra, ACT 2601 Australia.
China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China; Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, Copenhagen, 1350 Denmark.
Publication information
Gigascience. 2015 Sep 28;4:45. doi: 10.1186/s13742-015-0085-2. eCollection 2015.
BACKGROUND
The lizards of the family Agamidae are one of the most prominent elements of the Australian reptile fauna. Here, we present a genomic resource built on the basis of a wild-caught male ZZ central bearded dragon Pogona vitticeps.
FINDINGS
The genomic sequence for P. vitticeps, generated on the Illumina HiSeq 2000 platform, comprised 317 Gbp (179X raw read depth) from 13 insert libraries ranging from 250 bp to 40 kbp. After filtering for low-quality and duplicated reads, 146 Gbp of data (83X) was available for assembly. Exceptionally high levels of heterozygosity (0.85 % of single nucleotide polymorphisms plus sequence insertions or deletions) complicated assembly; nevertheless, 96.4 % of reads mapped back to the assembled scaffolds, indicating that the assembly included most of the sequenced genome. Length of the assembly was 1.8 Gbp in 545,310 scaffolds (69,852 longer than 300 bp), the longest being 14.68 Mbp. N50 was 2.29 Mbp. Genes were annotated on the basis of de novo prediction, similarity to the green anole Anolis carolinensis, Gallus gallus and Homo sapiens proteins, and P. vitticeps transcriptome sequence assemblies, to yield 19,406 protein-coding genes in the assembly, 63 % of which had intact open reading frames. Our assembly captured 99 % (246 of 248) of core CEGMA genes, with 93 % (231) being complete.
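The depth, CEGMA and N50 figures above follow from simple arithmetic, which the short Python sketch below reproduces. It is an illustration only and not the authors' pipeline: the function names are hypothetical, the relation depth = sequenced bases / genome size is the standard definition rather than something stated in the abstract, and the scaffold lengths passed to `n50` are made-up toy values, since the real scaffold length distribution is not given here.

```python
def implied_genome_size_gbp(total_gbp, depth):
    """Genome size implied by: depth = sequenced bases / genome size."""
    return total_gbp / depth


def n50(lengths):
    """Length L such that scaffolds of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Figures taken from the abstract.
raw_size = implied_genome_size_gbp(317, 179)      # raw reads
filtered_size = implied_genome_size_gbp(146, 83)  # reads kept for assembly
cegma_found = 100 * 246 / 248                     # core genes captured
cegma_complete = 100 * 231 / 248                  # core genes complete

# Toy scaffold lengths (hypothetical), only to show how N50 is computed.
toy_n50 = n50([10, 8, 6, 4, 2])

print(f"implied genome size: {raw_size:.2f} Gbp (raw), {filtered_size:.2f} Gbp (filtered)")
print(f"CEGMA: {cegma_found:.1f} % found, {cegma_complete:.1f} % complete")
print(f"toy N50: {toy_n50}")
```

Both depth figures imply a genome of roughly 1.8 Gbp, consistent with the reported assembly length, and the CEGMA ratios round to the 99 % and 93 % quoted in the text.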
CONCLUSIONS
The quality of the P. vitticeps assembly is comparable or superior to that of other published squamate genomes, and the annotated P. vitticeps genome can be accessed through a genome browser available at https://genomics.canberra.edu.au.