Evolution of the unspliced transcriptome.

Author information

Engelhardt Jan, Stadler Peter F

Affiliations

Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Haertelstraße 16-18, Leipzig, D-04107, Germany.

Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, Leipzig, D-04103, Germany.

Publication information

BMC Evol Biol. 2015 Aug 20;15:166. doi: 10.1186/s12862-015-0437-7.

Abstract

BACKGROUND

Despite their abundance, unspliced EST data have received little attention as a source of information on non-coding RNAs. Very little is known, therefore, about the genomic distribution of unspliced non-coding transcripts and their relationship with the much better studied regularly spliced products. In particular, their evolution has remained virtually unstudied.

RESULTS

We systematically study the evidence on unspliced transcripts available in EST annotation tracks for human and mouse, comprising 104,980 and 66,109 unspliced EST clusters, respectively. Roughly one third of these are located totally inside introns of known genes (TINs) and another third overlap exonic regions (PINs). Eleven percent are "intergenic", far away from any annotated gene. Direct evidence for the independent transcription of many PINs and TINs is obtained from CAGE tag and chromatin data. We predict more than 2000 3'UTR-associated RNA candidates each for human and mouse. Fifteen to twenty percent of the unspliced EST clusters are conserved between human and mouse. With the exception of TINs, the sequences of unspliced EST clusters evolve significantly more slowly than genomic background. Furthermore, like spliced lincRNAs, they show highly tissue-specific expression patterns.
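
The positional categories named above (TIN, PIN, intergenic) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Gene` class, the function `classify_cluster`, the half-open coordinate convention, the neglect of strand, and the 10 kb `intergenic_distance` cutoff are all assumptions made for illustration. The abstract says only that intergenic clusters lie "far away from any annotated gene".

```python
from dataclasses import dataclass
from typing import List, Tuple

# Half-open interval [start, end) on a single chromosome (strand ignored here).
Interval = Tuple[int, int]


@dataclass
class Gene:
    """Hypothetical gene model: exons sorted by position and non-overlapping."""
    exons: List[Interval]

    @property
    def span(self) -> Interval:
        return (self.exons[0][0], self.exons[-1][1])

    @property
    def introns(self) -> List[Interval]:
        # Introns are the gaps between consecutive exons.
        return [(a[1], b[0]) for a, b in zip(self.exons, self.exons[1:])]


def overlaps(a: Interval, b: Interval) -> bool:
    return a[0] < b[1] and b[0] < a[1]


def contains(outer: Interval, inner: Interval) -> bool:
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def classify_cluster(cluster: Interval, genes: List[Gene],
                     intergenic_distance: int = 10_000) -> str:
    """Assign an unspliced EST cluster to a positional category.

    intergenic_distance is an assumed cutoff, not a value from the source.
    """
    # PIN: the cluster overlaps an exonic region of some gene.
    if any(overlaps(cluster, e) for g in genes for e in g.exons):
        return "PIN"
    # TIN: the cluster lies totally inside an intron of some gene.
    if any(contains(i, cluster) for g in genes for i in g.introns):
        return "TIN"

    # Otherwise measure the gap to every annotated gene.
    def gap(span: Interval) -> int:
        return max(span[0] - cluster[1], cluster[0] - span[1], 0)

    if all(gap(g.span) >= intergenic_distance for g in genes):
        return "intergenic"
    return "other"  # near a gene, but neither exonic nor intronic


gene = Gene(exons=[(100, 200), (500, 600)])
labels = [classify_cluster(c, [gene])
          for c in [(250, 400), (150, 300), (50_000, 50_500), (700, 900)]]
```

Checking exon overlap before intron containment means a cluster that touches any exon is never labelled a TIN, which matches the "totally inside introns" wording. A real analysis would also need to handle strand, multiple isoforms, and overlapping genes.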

CONCLUSIONS

Unspliced long non-coding RNAs are an important, rapidly evolving component of mammalian transcriptomes. Their analysis is complicated by their preferential association with complex transcribed loci that usually also harbor a plethora of spliced transcripts. Unspliced EST data, although typically disregarded in transcriptome analysis, can be used to gain insights into this rarely investigated transcriptome component. The frequently postulated connection between lack of splicing and nuclear retention, together with the surprising overlap with chromatin-associated transcripts, suggests that this class of transcripts might be involved in chromatin organization and possibly other mechanisms of epigenetic control.

