Akkuratov Evgeny E, Walters Lorraine, Saha-Mandal Arnab, Khandekar Sushant, Crawford Erin, Zirbel Craig L, Leisner Scott, Prakash Ashwin, Fedorova Larisa, Fedorov Alexei
Faculty of Biology and Soil Science, St. Petersburg State University, St. Petersburg 199034, Russia; Program in Bioinformatics and Proteomics/Genomics, University of Toledo, Health Science Campus, Toledo, OH 43614, USA.
Program in Bioinformatics and Proteomics/Genomics, University of Toledo, Health Science Campus, Toledo, OH 43614, USA; Department of Bioengineering, University of Toledo, Main Campus, Toledo, OH 43606, USA.
Gene. 2014 Sep 10;548(1):81-90. doi: 10.1016/j.gene.2014.07.012. Epub 2014 Jul 8.
Orthologous introns occupy identical positions relative to the coding sequence in orthologous genes of different species. By analyzing the complete genomes of five plants, we generated a database of 40,512 orthologous intron groups of dicotyledonous plants, 28,519 orthologous intron groups of angiosperms, and 15,726 of land plants (moss and angiosperms). Multiple sequence alignments of each orthologous intron group were obtained using the Mafft algorithm. The number of conserved regions in plant introns appeared to be hundreds of times lower than in mammals or vertebrates. Approximately three quarters of conserved intronic regions among angiosperms, and among dicots in particular, correspond to alternatively spliced exonic sequences. We registered only a handful of conserved intronic ncRNAs in flowering plants. However, the most evolutionarily conserved intronic region, which is present in all plants examined in this study, including moss, possessed multiple structural features of tRNAs, which led us to classify it as a putative tRNA-like ncRNA. Intronic sequences encoding tRNA-like structures are not unique to plants. Bioinformatic examination of tRNAs located inside introns revealed an unusually long-term association of four glycine tRNAs inside the Vac14 gene of fish, amniotes, and mammals.
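The definition of orthologous introns given above (identical positions relative to the coding sequence) can be illustrated with a minimal Python sketch. It is not the pipeline used in the study. It assumes, as a simplification, that the coding sequences of the orthologous genes align without gaps, so that an intron position can be compared directly as an offset into the coding sequence. The function names, species labels, and exon lengths are all hypothetical.

```python
from collections import defaultdict


def intron_positions(exon_lengths):
    """Return (cds_offset, phase) for each intron of one gene.

    exon_lengths: lengths (in nucleotides) of the coding exons, in order.
    cds_offset is the number of coding nucleotides preceding the intron;
    phase is cds_offset modulo 3 (position of the intron within a codon).
    """
    positions = []
    offset = 0
    for length in exon_lengths[:-1]:  # there is no intron after the last exon
        offset += length
        positions.append((offset, offset % 3))
    return positions


def group_orthologous_introns(genes):
    """Group introns that share the same position in the coding sequence.

    genes: dict mapping a species label to the coding exon lengths of one
    orthologous gene. Assumption (simplification): the coding sequences
    align without gaps, so equal offsets mean equal positions.
    Returns a dict mapping (cds_offset, phase) to a list of
    (species, intron_number) pairs, keeping only positions shared by
    at least two species.
    """
    groups = defaultdict(list)
    for species, exon_lengths in genes.items():
        for number, position in enumerate(intron_positions(exon_lengths), start=1):
            groups[position].append((species, number))
    return {pos: members for pos, members in groups.items() if len(members) > 1}


# Hypothetical exon structures for one orthologous gene in three species.
example_genes = {
    "species_a": [120, 90, 150],
    "species_b": [120, 240],
    "species_c": [120, 90, 150],
}
orthologous_groups = group_orthologous_introns(example_genes)
```

In this toy input, the first intron is shared by all three species and the second by `species_a` and `species_c` only. With real genomes the coding sequences differ in length, so positions would have to be mapped through an alignment of the coding or protein sequences before being compared.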