A subset of conserved mammalian long non-coding RNAs are fossils of ancestral protein-coding genes.

Affiliation

Department of Biological Regulation, Weizmann Institute of Science, 234 Herzl St., Rehovot, 76100, Israel.

Publication information

Genome Biol. 2017 Aug 30;18(1):162. doi: 10.1186/s13059-017-1293-0.

Abstract

BACKGROUND

Only a small portion of human long non-coding RNAs (lncRNAs) appear to be conserved outside of mammals, but the events underlying the birth of new lncRNAs in mammals remain largely unknown. One potential source is remnants of protein-coding genes that transitioned into lncRNAs.

RESULTS

We systematically compare lncRNA and protein-coding loci across vertebrates, and estimate that up to 5% of conserved mammalian lncRNAs are derived from lost protein-coding genes. These lncRNAs have specific characteristics, such as broader expression domains, that set them apart from other lncRNAs. Fourteen lncRNAs have sequence similarity with the loci of the contemporary homologs of the lost protein-coding genes. We propose that selection acting on enhancer sequences is mostly responsible for retention of these regions. As an example of an RNA element from a protein-coding ancestor that was retained in the lncRNA, we describe in detail a short translated ORF in the JPX lncRNA that was derived from an upstream ORF in a protein-coding gene and retains some of its functionality.

CONCLUSIONS

We estimate that ~55 annotated conserved human lncRNAs are derived from parts of ancestral protein-coding genes, and loss of coding potential is thus a non-negligible source of new lncRNAs. Some lncRNAs inherited regulatory elements influencing transcription and translation from their protein-coding ancestors, and those elements can influence the expression breadth and functionality of these lncRNAs.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5067/5577775/288ecf40c2bf/13059_2017_1293_Fig1_HTML.jpg
