La Torre Renato, Hamilton John P, Saucedo-Bazalar Manuel, Caycho Esteban, Vaillancourt Brieanne, Wood Joshua C, Ramírez Manuel, Buell C Robin, Orjeda Gisella
Laboratory of Genomics and Bioinformatics for Biodiversity, Faculty of Biological Sciences, Universidad Nacional Mayor de San Marcos, Lima 15081, Peru.
Center for Applied Genetic Technologies, University of Georgia, Athens, GA 30602, USA.
G3 (Bethesda). 2025 Feb 5;15(2). doi: 10.1093/g3journal/jkae283.
The dry forests of northern Peru are dominated by the leguminous tree Neltuma pallida, which is adapted to hot, arid and semiarid conditions in the tropics. Although it has been successfully introduced in many other areas around the world, N. pallida is currently threatened in its native range, where it is invaluable to the dry forest ecosystem and to human subsistence. A reference genome sequence is a major tool for enhancing ecosystem conservation and for understanding how N. pallida is adapted to dry forest ecosystems. Here, we report a high-quality reference genome for N. pallida. The final genome assembly is 403.7 Mb in size and consists of 14 pseudochromosomes and 63 scaffolds, with an N50 of 26.2 Mb and a GC content of 34.3%. Assessment with Benchmarking Universal Single-Copy Orthologs (BUSCO) revealed 99.2% complete orthologs. Repetitive sequence content was 51.2% and was dominated by long terminal repeat elements. Genes were annotated using N. pallida transcripts, plant protein sequences, and ab initio predictions, yielding 22,409 protein-coding genes represented by 24,607 gene models. Comparative genomic analysis showed evidence of rapidly evolving gene families related to disease resistance, transcription factors, and signaling pathways. The chromosome-scale N. pallida reference genome will be a useful resource for understanding plant evolution in extreme and highly variable environments.