Jiang Chongyi, Huang Zixia, Meizoso Cynthia, Kumpfmüller Gaby, Wolf Jochen B W, Schielzeth Holger
Population Ecology Group, Institute of Ecology and Evolution, Friedrich Schiller University, Jena, Germany.
School of Biology and Environmental Science, University College Dublin, Dublin, Ireland.
Sci Data. 2025 Jun 2;12(1):922. doi: 10.1038/s41597-025-05280-6.
Grasslands are essential, biodiverse ecosystems providing critical ecosystem services. Despite their ecological and economic value, transcriptomic resources for wild grassland species to support eco-evolutionary and functional genomic studies remain limited. Here, we present full-length transcriptomes for shoot tissue from 25 wild grassland plant species collected from a long-term biodiversity experiment (the Jena Experiment). Using PacBio Iso-Seq technology, we generated a total of 522.45 million subreads, which were assembled into unique transcripts for each species independently. This resulted in an average of 49,180 transcripts per species, of which 68.6% were successfully annotated using the Swiss-Prot database. Furthermore, 40.3% of the transcripts contained complete open reading frames (ORFs), while 31.4% had incomplete ORFs. More than 36.8% of the transcripts were identified as non-coding RNAs. On average, 5.08% of the bases across all transcriptomes were flagged as repetitive elements. This dataset offers a valuable full-length transcriptomic resource for studying gene expression, alternative splicing, and evolutionary patterns in grassland species, paving the way for future research in functional genomics and conservation.