Lopez-Ezquerra Alberto, Harrison Mark C, Bornberg-Bauer Erich
Institute of Evolution and Biodiversity, University of Münster, Hüfferstrasse 1, Münster, Germany.
BMC Evol Biol. 2017 Jul 3;17(1):155. doi: 10.1186/s12862-017-0985-0.
The ever-increasing availability of genomes makes it possible to investigate and compare not only the genomic complements of genes and proteins, but also those of RNAs. One class of RNAs, the long noncoding RNAs (lncRNAs), and in particular their subclass of long intergenic noncoding RNAs (lincRNAs), has recently gained much attention because of its roles in the regulation of important biological processes such as immune response and cell differentiation, and as a possible source of evolutionary precursors for protein-coding genes. lincRNAs seem to be poorly conserved at the sequence level, but at least some lincRNAs have conserved structural elements and syntenic genomic positions. Previous studies showed that transposable elements are major contributors to the evolution of lincRNAs in mammals. In contrast, plant lincRNA emergence and evolution have been linked to local duplication events. However, little is known about the evolutionary dynamics of lincRNAs in general, and in insect genomes in particular.
Here we compared lincRNAs among seven insect genomes and investigated possible evolutionary changes and functional roles. We find very low sequence conservation between species, and that similarities within a species are mostly due to the association of lincRNAs with transposable elements (TEs) and simple repeats. Furthermore, we find that TEs are less frequent in lincRNA exons than in lincRNA introns, indicating that TEs may have been removed from exons by selection. When we analysed the predicted thermodynamic stabilities of lincRNAs, we found that they are more stable than their randomized controls, which might indicate some selection pressure to maintain certain structural elements. We list several of the most stable lincRNAs, which could serve as prime candidates for future functional studies. We also discuss the possibility of de novo protein-coding genes emerging from lincRNAs, because lincRNAs with high GC content, and potentially with longer open reading frames (ORFs), are candidate loci where de novo gene emergence might occur.
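As an illustration of the two sequence features named above, GC content and ORF length, the following minimal Python sketch computes both for a single transcript. It is not the authors' pipeline: the function names `gc_content` and `longest_orf` are hypothetical, and the ORF definition used here (ATG to the first in-frame stop codon, forward strand only, length counted in nucleotides including the stop codon) is an assumption that the abstract does not specify.

```python
# Illustrative sketch only; function names and the ORF definition are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_orf(seq: str) -> int:
    """Return the length in nucleotides of the longest forward-strand ORF.

    Assumed definition: an ORF runs from an ATG to the first in-frame stop
    codon, and the stop codon is included in the reported length.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    best = 0
    for frame in range(3):
        start = None  # index of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


if __name__ == "__main__":
    transcript = "CCATGAAATGGTAACC"  # toy sequence, not real data
    print(gc_content(transcript), longest_orf(transcript))
```

Under these assumptions, transcripts could be ranked by both values to shortlist candidate loci. A real analysis would also need to consider the reverse strand and a minimum ORF length threshold.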
The processes responsible for the emergence and diversification of lincRNAs in insects remain unclear. Both duplication and transposable elements may be important for the creation of new lincRNAs in insects.