Pinto Brendan J, Gamble Tony, Smith Chase H, Keating Shannon E, Havird Justin C, Chiari Ylenia
School of Life Sciences, Arizona State University, Tempe, AZ USA.
Center for Evolution and Medicine, Arizona State University, Tempe, AZ USA.
bioRxiv. 2023 Feb 13:2023.01.20.523807. doi: 10.1101/2023.01.20.523807.
Genomic resources for squamate reptiles (lizards and snakes) have lagged behind those of other vertebrate systems, and high-quality reference genomes remain scarce. The 23 chromosome-scale reference genomes across the order represent only 12 of the ~60 squamate families. Within geckos (infraorder Gekkota), a species-rich clade of lizards, chromosome-level genomes are exceptionally sparse, representing only two of the seven extant families. Using the latest advances in genome sequencing and assembly methods, we generated one of the highest-quality squamate genomes to date for the leopard gecko, Eublepharis macularius (Eublepharidae). We compared this assembly to the previous, short-read-only reference genome published in 2016 and examined factors within the assembly that may influence the contiguity of genome assemblies built from PacBio HiFi data. Briefly, the read N50 of the PacBio HiFi reads generated for this study was equal to the contig N50 of the previous reference genome, at 20.4 kilobases. The HiFi reads were assembled into a total of 132 contigs, which were further scaffolded using HiC data into 75 total sequences representing all 19 chromosomes. We found that 9 of the 19 chromosomes were assembled as single contigs, while the other 10 chromosomes were each scaffolded together from two or more contigs. We qualitatively identified that percent repeat content within a chromosome broadly affects its assembly contiguity prior to scaffolding. This genome assembly signifies a new age for squamate genomics in which high-quality reference genomes rivaling some of the best vertebrate genome assemblies can be generated for a fraction of previous cost estimates. This new reference assembly is available on NCBI at JAOPLA010000000. The genome version and its associated annotations are also available via this Figshare repository (https://doi.org/10.6084/m9.figshare.20069273).
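The abstract compares a read N50 with a contig N50, so it may help to recall how the statistic is usually computed: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of the summed length. The Python sketch below illustrates that common definition only. The function name `n50` and the toy lengths are illustrative assumptions, not data or code from this study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    Uses the common definition: the length L such that sequences of
    length >= L together cover at least half of the total length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths (not from the leopard gecko assembly).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

The same function applies equally to read lengths and contig lengths, which is why a read N50 and a contig N50 can be compared directly, as the abstract does.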