Nitsche Anne, Rose Dominic, Fasold Mario, Reiche Kristin, Stadler Peter F
Bioinformatics Group, Department of Computer Science, University of Leipzig, D-04107 Leipzig, Germany; Interdisciplinary Center for Bioinformatics, University of Leipzig, D-04107 Leipzig, Germany.
Bioinformatics Group, Department of Computer Science, University of Freiburg, D-79110 Freiburg, Germany; MML, Munich Leukemia Laboratory GmbH, D-81377 München, Germany.
RNA. 2015 May;21(5):801-12. doi: 10.1261/rna.046342.114. Epub 2015 Mar 23.
Large-scale RNA sequencing has revealed a large number of long mRNA-like transcripts (lncRNAs) that do not code for proteins. The evolutionary history of these lncRNAs has been notoriously hard to study systematically because their low level of sequence conservation precludes comprehensive homology-based surveys and makes them nearly impossible to align. An increasing number of special cases, however, have been shown to be at least as old as the vertebrate lineage. Here we use the conservation of splice sites to trace the evolution of lncRNAs. We show that >85% of the human GENCODE lncRNAs were already present at the divergence of placental mammals and that many hundreds of these RNAs date back even further. Nevertheless, we observe a fast turnover of intron/exon structures. We conclude that lncRNA genes are evolutionarily ancient components of vertebrate genomes that show an unexpected and unprecedented evolutionary plasticity. We offer a public web service (http://splicemap.bioinf.uni-leipzig.de) that allows users to retrieve sets of orthologous splice sites and to produce overview maps of evolutionarily conserved splice sites for visualization and further analysis. An electronic supplement containing the ncRNA data sets used in this study is available at http://www.bioinf.uni-leipzig.de/publications/supplements/12-001.