Department of Entomology, College of Plant Protection, Nanjing Agricultural University, Nanjing, 210095, China.
School of Life Sciences, Taizhou University, Taizhou, 318000, China.
Sci Data. 2023 Aug 16;10(1):541. doi: 10.1038/s41597-023-02456-w.
The Entomobryoidea, the largest superfamily of Collembola, comprises over 2,000 species worldwide. However, the lack of high-quality genomes hinders our understanding of the evolution and ecology of this group. This study presents a chromosome-level genome assembly of Entomobrya proxima, produced by combining PacBio long reads, Illumina short reads, and Hi-C data. The assembly is 362.37 Mb in size, with a scaffold N50 of 57.67 Mb, and 97.12% (351.95 Mb) of it is located on six chromosomes. BUSCO analysis of the assembly indicates a completeness of 96.1% (n = 1,013), including 946 (93.4%) single-copy and 27 (2.7%) duplicated BUSCOs. The genome contains 22.16% (80.06 Mb) repetitive elements and 20,988 predicted protein-coding genes. Gene family evolution analysis identified 177 significantly expanded gene families in E. proxima, primarily associated with detoxification and metabolism. Moreover, inter-genomic synteny analysis showed strong chromosomal synteny between E. proxima and Sinella curviseta. This study provides valuable genomic information for understanding the evolution and ecology of Collembola.
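The summary statistics quoted above follow from simple arithmetic, which the minimal Python sketch below illustrates. It is not the authors' pipeline: the helper names `n50` and `percent` are hypothetical, and the toy scaffold lengths used to demonstrate N50 are invented for illustration only. The chromosome and BUSCO figures are taken directly from the abstract.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be non-empty")


def percent(part, whole, digits):
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100.0 * part / whole, digits)


# Toy scaffold lengths (hypothetical, not from the paper).
toy_n50 = n50([10, 8, 6, 4, 2])

# Figures reported in the abstract.
assembly_mb = 362.37       # total assembly size
on_chromosomes_mb = 351.95 # sequence located on the six chromosomes
busco_total = 1013         # number of BUSCO groups searched
busco_single = 946         # complete, single-copy
busco_duplicated = 27      # complete, duplicated

chromosome_pct = percent(on_chromosomes_mb, assembly_mb, 2)
single_pct = percent(busco_single, busco_total, 1)
duplicated_pct = percent(busco_duplicated, busco_total, 1)
complete_pct = percent(busco_single + busco_duplicated, busco_total, 1)

print(toy_n50, chromosome_pct, single_pct, duplicated_pct, complete_pct)
```

BUSCO completeness here is the sum of single-copy and duplicated groups divided by the number of groups searched, which is how the 96.1% figure relates to the 946 and 27 counts.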