Hominoid-specific de novo protein-coding genes originating from long non-coding RNAs.

Affiliation

Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Peking University, Beijing, China.

Publication information

PLoS Genet. 2012 Sep;8(9):e1002942. doi: 10.1371/journal.pgen.1002942. Epub 2012 Sep 13.

Abstract

Tinkering with pre-existing genes has long been known as a major way to create new genes. Recently, however, motherless protein-coding genes have been found to have emerged de novo from ancestral non-coding DNAs. How these genes originated is not well addressed to date. Here we identified 24 hominoid-specific de novo protein-coding genes with precise origination timing in vertebrate phylogeny. Strand-specific RNA-Seq analyses were performed in five rhesus macaque tissues (liver, prefrontal cortex, skeletal muscle, adipose, and testis), which were then integrated with public transcriptome data from human, chimpanzee, and rhesus macaque. On the basis of comparing the RNA expression profiles in the three species, we found that most of the hominoid-specific de novo protein-coding genes encoded polyadenylated non-coding RNAs in rhesus macaque or chimpanzee with a similar transcript structure and correlated tissue expression profile. According to the rule of parsimony, the majority of these hominoid-specific de novo protein-coding genes appear to have acquired a regulated transcript structure and expression profile before acquiring coding potential. Interestingly, although the expression profile was largely correlated, the coding genes in human often showed higher transcriptional abundance than their non-coding counterparts in rhesus macaque. The major findings we report in this manuscript are robust and insensitive to the parameters used in the identification and analysis of de novo genes. Our results suggest that at least a portion of long non-coding RNAs, especially those with active and regulated transcription, may serve as a birth pool for protein-coding genes, which are then further optimized at the transcriptional level.
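The cross-species comparison described above rests on two checks: whether a gene's tissue expression profile in human correlates with that of its non-coding counterpart in rhesus macaque, and whether the human transcript is more abundant. The following is a minimal sketch of that kind of comparison, not the authors' pipeline. The abstract does not say which correlation measure or abundance unit was used, so the Pearson coefficient, the function name `pearson`, and every expression value below are assumptions made purely for illustration.

```python
from math import sqrt

# The five tissues profiled by strand-specific RNA-Seq in rhesus macaque.
TISSUES = ["liver", "prefrontal cortex", "skeletal muscle", "adipose", "testis"]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


# Hypothetical abundance values (arbitrary units), one per tissue in TISSUES.
# These numbers are invented for illustration and are NOT data from the paper.
human_coding = [2.0, 8.0, 1.0, 3.0, 20.0]
macaque_noncoding = [1.0, 3.0, 0.5, 2.0, 9.0]

r = pearson(human_coding, macaque_noncoding)
mean_human = sum(human_coding) / len(human_coding)
mean_macaque = sum(macaque_noncoding) / len(macaque_noncoding)

print(f"profile correlation r = {r:.3f}")
print(f"mean abundance: human {mean_human:.2f}, macaque {mean_macaque:.2f}")
```

In this toy case the two profiles have the same shape across tissues (high correlation) while the human values are uniformly larger, which is the pattern the abstract reports for many of the 24 genes.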

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45ea/3441637/730fed694af1/pgen.1002942.g001.jpg
