Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstr. 108, 01307, Dresden, Germany.
Max Planck Institute for the Physics of Complex Systems, Nöthnitzerstr. 38, 01187, Dresden, Germany.
Gigascience. 2018 Dec 1;7(12):giy141. doi: 10.1093/gigascience/giy141.
Reptiles are a species-rich group with great phenotypic and life history diversity but are highly underrepresented among the vertebrate species with sequenced genomes.
Here, we report a high-quality genome assembly of the tegu lizard, Salvator merianae, the first lacertoid with a sequenced genome. We combined 74X Illumina short-read, 29.8X Pacific Biosciences long-read, and optical mapping data to generate a high-quality assembly with a scaffold N50 value of 55.4 Mb. The contig N50 value of this assembly is 521 Kb, making it the most contiguous reptile assembly so far. We show that the tegu assembly has the highest completeness of coding genes and conserved non-exonic elements (CNEs) compared to other reptiles. Furthermore, the tegu assembly has the highest number of evolutionarily conserved CNE pairs, corroborating a high assembly contiguity in intergenic regions. As in other reptiles, long interspersed nuclear elements comprise the most abundant transposon class. We used transcriptomic data, homology- and de novo gene predictions to annotate 22,413 coding genes, of which 16,995 (76%) likely have human orthologs as inferred by CESAR-derived gene mappings. Finally, we generated a multiple genome alignment comprising 10 squamates and 7 other amniote species and identified conserved regions that are under evolutionary constraint. CNEs cover 38 Mb (1.8%) of the tegu genome, with 3.3 Mb in these elements being squamate specific. In contrast to placental mammal-specific CNEs, very few of these squamate-specific CNEs (<20 Kb) overlap transposons, highlighting a difference in how lineage-specific CNEs originated in these two clades.
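The scaffold and contig N50 values quoted above follow the standard definition: the length L such that sequences of length L or longer together contain at least half of all assembled bases. The sketch below illustrates that definition only. The function name `n50` and the toy lengths are assumptions made for illustration, and it is not the software used to evaluate the tegu assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (arbitrary units), not data from the paper.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Applied to the real scaffold or contig lengths of an assembly (for example, parsed from a FASTA index), the same function would return figures comparable to the 55.4 Mb and 521 Kb values reported here.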
The tegu lizard genome, together with the multiple genome alignment and comprehensive conserved element datasets, provides a valuable resource for comparative genomic studies of reptiles and other amniotes.