CAS Key Laboratory of Tropical Plant Resources and Sustainable Use, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Menglun, Mengla, 666303, Yunnan, China.
College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China.
BMC Genomics. 2021 Oct 14;22(Suppl 3):739. doi: 10.1186/s12864-021-08047-6.
Long non-coding RNAs (lncRNAs) play vital roles in many important biological processes in plants. Currently, a large fraction of plant lncRNA studies center on lncRNA identification and functional analysis. Only a few focus on the evolutionary history of lncRNAs, which is crucial for understanding them in depth. Integrating large volumes of plant lncRNA data is therefore required to investigate lncRNA evolution thoroughly.
We present a large-scale evolutionary analysis of lncRNAs in 25 flowering plants. In total, we identified 199,796 high-confidence lncRNAs through data integration analysis and grouped them into 5497 lncRNA orthologous families. We then divided the lncRNAs into groups based on their degree of sequence conservation and quantified various characteristics of 756 conserved Arabidopsis thaliana lncRNAs. Compared with non-conserved lncRNAs, conserved lncRNAs might have more exons, longer sequences, higher expression levels, and lower tissue specificity. Functional annotation based on the A. thaliana coding-lncRNA gene co-expression network suggested potential functions of conserved lncRNAs, including autophagy, locomotion, and cell cycle. Enrichment analysis revealed that the functions of conserved lncRNAs were closely related to the growth and development of the tissues in which they were specifically expressed.
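As an illustration of what "tissue specificity" can mean quantitatively, the sketch below computes the tau index, one widely used measure. The abstract does not state which metric the authors used, so the choice of tau, the function name `tau_index`, and the example expression values are all assumptions made for illustration only.

```python
import math


def tau_index(expression):
    """Tau tissue-specificity index for one gene (assumed metric, see lead-in).

    expression: non-negative expression values, one per tissue.
    Returns a value in [0, 1]: 0 means equal expression in every tissue,
    1 means expression in a single tissue. Returns NaN if the gene is not
    expressed in any tissue, because the index is undefined in that case.
    """
    values = [float(v) for v in expression]
    if len(values) < 2:
        raise ValueError("tau needs expression values from at least two tissues")
    if any(v < 0 for v in values):
        raise ValueError("expression values must be non-negative")
    peak = max(values)
    if peak == 0:
        return math.nan
    # Sum of (1 - x_i / x_max) over tissues, normalised by (n - 1).
    return sum(1 - v / peak for v in values) / (len(values) - 1)


# Hypothetical expression profiles across four tissues.
broad = tau_index([5, 5, 5, 5])      # evenly expressed
specific = tau_index([0, 0, 10, 0])  # expressed in one tissue only
```

Under this measure, the lower tissue specificity reported for conserved lncRNAs would correspond to tau values closer to 0 than those of non-conserved lncRNAs.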
Comprehensive integration of large-scale lncRNA data and construction of a phylogenetic tree with orthologous lncRNA families from 25 flowering plants provide an overview of the evolutionary history of plant lncRNAs, including their origin, conservation, and orthologous relationships. Further analysis revealed a distinct profile of characteristics for conserved lncRNAs in A. thaliana compared with non-conserved lncRNAs. We also examined the tissue-specific expression and potential functional roles of conserved lncRNAs. The results presented here will further our understanding of plant lncRNA evolution and provide a basis for more in-depth studies of lncRNA function.