Overlapping genes produce proteins with unusual sequence properties and offer insight into de novo protein creation.

Authors

Rancurel Corinne, Khosravi Mahvash, Dunker A Keith, Romero Pedro R, Karlin David

Affiliation

Architecture et Fonction des Macromolécules Biologiques, Case 932, Campus de Luminy, 13288 Marseille Cedex 9, France.

Publication information

J Virol. 2009 Oct;83(20):10719-36. doi: 10.1128/JVI.00595-09. Epub 2009 Jul 29.

Abstract

It is widely assumed that new proteins are created by duplication, fusion, or fission of existing coding sequences. Another mechanism of protein birth is provided by overlapping genes. They are created de novo by mutations within a coding sequence that lead to the expression of a novel protein in another reading frame, a process called "overprinting." To investigate this mechanism, we have analyzed the sequences of the protein products of manually curated overlapping genes from 43 genera of unspliced RNA viruses infecting eukaryotes. Overlapping proteins have a sequence composition globally biased toward disorder-promoting amino acids and are predicted to contain significantly more structural disorder than nonoverlapping proteins. By analyzing the phylogenetic distribution of overlapping proteins, we were able to confirm that 17 of these had been created de novo and to study them individually. Most proteins created de novo are orphans (i.e., restricted to one species or genus). Almost all are accessory proteins that play a role in viral pathogenicity or spread, rather than proteins central to viral replication or structure. Most proteins created de novo are predicted to be fully disordered and have a highly unusual sequence composition. This suggests that some viral overlapping reading frames encoding hypothetical proteins with highly biased composition, often discarded as noncoding, might in fact encode proteins. Some proteins created de novo are predicted to be ordered, however, and whenever a three-dimensional structure of such a protein has been solved, it corresponds to a fold previously unobserved, suggesting that the study of these proteins could enhance our knowledge of protein space.
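
The two ideas in the abstract, a second protein read from another frame of the same nucleotide sequence and a composition biased toward disorder-promoting residues, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy sequence, and the residue set `ARGQSPEK` (a set commonly cited in the disorder literature) are assumptions made here for illustration, and the paper's own classification and disorder predictors may differ.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: a commonly cited set of disorder-promoting residues;
# the paper's own classification may differ.
DISORDER_PROMOTING = frozenset("ARGQSPEK")


def translate(dna: str, frame: int = 0) -> str:
    """Translate dna from offset `frame` (0, 1 or 2); '*' marks a stop codon.

    Trailing bases that do not fill a codon are ignored. Only A, C, G, T are
    handled; any other symbol raises KeyError.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def disorder_promoting_fraction(protein: str) -> float:
    """Fraction of residues (stop symbols excluded) in DISORDER_PROMOTING."""
    residues = protein.replace("*", "")
    if not residues:
        return 0.0
    return sum(r in DISORDER_PROMOTING for r in residues) / len(residues)


if __name__ == "__main__":
    toy_gene = "ATGAAAGGGCCTGAACGT"  # hypothetical sequence, not from the paper
    for frame in (0, 1):
        protein = translate(toy_gene, frame)
        print(frame, protein, round(disorder_promoting_fraction(protein), 2))
```

Reading the same sequence at frame 0 and frame 1 yields two unrelated amino acid strings, which is the situation overprinting creates. A composition score like this is only a crude proxy; predicting structural disorder would require a dedicated predictor.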

Similar articles

Gene Birth Contributes to Structural Disorder Encoded by Overlapping Genes.
Genetics. 2018 Sep;210(1):303-313. doi: 10.1534/genetics.118.301249. Epub 2018 Jul 19.

Evolution of viral proteins originated de novo by overprinting.
Mol Biol Evol. 2012 Dec;29(12):3767-80. doi: 10.1093/molbev/mss179. Epub 2012 Jul 19.

Origins of genes: "big bang" or continuous creation?
Proc Natl Acad Sci U S A. 1992 Oct 15;89(20):9489-93. doi: 10.1073/pnas.89.20.9489.

Cited by

Three Novel Antisense Overlapping Genes in E. coli O157:H7 EDL933.
Microbiol Spectr. 2023 Feb 14;11(1):e0235122. doi: 10.1128/spectrum.02351-22. Epub 2022 Dec 19.

Viral Complexity.
Biomolecules. 2022 Jul 30;12(8):1061. doi: 10.3390/biom12081061.

Gene Overlapping as a Modulator of Evolution.
Microorganisms. 2022 Feb 4;10(2):366. doi: 10.3390/microorganisms10020366.
