Belshaw Robert, Pybus Oliver G, Rambaut Andrew
Department of Zoology, University of Oxford, Oxford OX1 3PS, United Kingdom.
Genome Res. 2007 Oct;17(10):1496-504. doi: 10.1101/gr.6305707. Epub 2007 Sep 4.
The genomes of RNA viruses are characterized by their extremely small size and extremely high mutation rates (typically 10 kb and 10^-4 per base per replication cycle, respectively), traits that are thought to be causally linked. One aspect of their small size is the genome compression caused by the use of overlapping genes (where some nucleotides code for two genes). Using a comparative analysis of all known RNA viral species, we show that viruses with larger genomes tend to have less gene overlap. We provide a numerical model to show how a high mutation rate could lead to gene overlap, and we discuss the factors that might explain the observed relationship between gene overlap and genome size. We also propose a model for the evolution of gene overlap based on the co-opting of previously unused ORFs, which gives rise to two types of overlap: (1) the creation of novel genes inside older genes, predominantly via +1 frameshifts, and (2) the incremental increase in overlap between originally contiguous genes, with no frameshift preference. Both types of overlap are viewed as the creation of genomic novelty under pressure for genome compression. Simulations based on our model generate the empirical size distributions of overlaps and explain the observed frameshift preferences. We suggest that RNA viruses are a good model system for investigating the general evolutionary relationships between genome attributes such as mutational robustness, mutation rate, and size.
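As a rough illustration of the quantities quoted in the abstract, the following Python sketch computes the expected number of mutations per genome per replication cycle for the typical values given (10 kb, 10^-4 per base), and the length and relative reading frame of an overlap between two genes. It is not the authors' numerical model or simulation. The function names, the (start, end) coordinate representation, and the frame-offset convention are assumptions made for this example, and the offset may not correspond to the +1/+2 frameshift labels used in the paper.

```python
def mutations_per_genome(genome_length_nt, rate_per_base):
    """Expected number of mutations per genome per replication cycle."""
    return genome_length_nt * rate_per_base


def prob_error_free_copy(genome_length_nt, rate_per_base):
    """Probability that one replication introduces no mutation,
    assuming independent, identical per-base error probabilities."""
    return (1.0 - rate_per_base) ** genome_length_nt


def gene_overlap(gene_a, gene_b):
    """Return (overlap_length_nt, frame_offset) for two genes on the same strand.

    Genes are hypothetical (start, end) tuples, 1-based and inclusive.
    frame_offset is the start of gene_b relative to gene_a, modulo 3;
    this convention is an assumption of the sketch.
    """
    start = max(gene_a[0], gene_b[0])
    end = min(gene_a[1], gene_b[1])
    overlap_length = max(0, end - start + 1)
    frame_offset = (gene_b[0] - gene_a[0]) % 3
    return overlap_length, frame_offset


if __name__ == "__main__":
    # Typical values quoted in the abstract: 10 kb genome, 10^-4 per base per cycle.
    print(mutations_per_genome(10_000, 1e-4))
    print(prob_error_free_copy(10_000, 1e-4))
    # Two made-up genes sharing 50 nucleotides in different reading frames.
    print(gene_overlap((1, 300), (251, 600)))
```

With the typical values from the abstract, the expected load is about one mutation per genome per replication cycle, which gives a sense of why genome size and mutation rate are thought to be linked.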