Bronson E C, Anderson J N
Department of Biological Sciences, Purdue University, West Lafayette, IN 47907.
J Mol Evol. 1994 May;38(5):506-32. doi: 10.1007/BF00178851.
All complete retrovirus sequences in the GenEMBL database were examined with the goal of assessing possible relationships between the nucleotide composition of retroviral genomes, the amino acid composition of retroviral proteins, and evolutionary strategies used by retroviruses. The results demonstrated that the genome of each viral lineage has a characteristic base composition and that the variations between groups are related to retroviral phylogeny. By analogy to microbial species, we suggest that the variations arise from group-specific patterns of directional mutations where the bias can be exerted on any of the four nucleotides. It is most likely that the mutational patterns are introduced during reverse transcription, and a direct participation of reverse transcriptase in the process is suspected. A straightforward strategy was used to analyze the compositional relationship between nucleotides and encoded amino acids. The procedure entailed calculations of amino acid frequencies from nucleotide content and the comparison of the calculated values to the observed amino acid frequencies in retroviruses. The results revealed an excellent correspondence between variation in genomic base composition and variation in amino acid composition of proteins with the compositional differences extending into all major coding regions of the viruses. Because of the magnitude and dispersion of these effects, and because of the nonconservative nature of many of the substitutions between groups with different genomic biases, we suggest that the variations in protein composition driven by biased nucleotide frequencies are an important factor in shaping the characteristic phenotypes of the different viral lineages. A clue to the nature of the evolutionary forces that are responsible for the generation of nucleotide biases was provided by the observation that viruses with radically different base frequencies most often inhabit the same cell type. 
This observation, along with analysis of amino acid and nucleotide replacement patterns between and within reverse transcriptase sequences from the various groups, permitted us to advance a model for the evolution of retroviruses. According to the model, speciation could initiate when daughter virions from a single progenitor vary in the direction of their mutational bias. These variations would exert a pleiotropic effect on the frequencies of nucleotides in all viral genes and consequently on the frequencies of amino acids in the encoded proteins. The variants with the most extreme compositional differences would have a selective advantage because their different precursor requirements would enable them to occupy different ecological niches within a single cell. (ABSTRACT TRUNCATED AT 400 WORDS)
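The abstract's procedure of calculating amino acid frequencies from nucleotide content and comparing them with observed frequencies can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the details of their calculation. The sketch assumes that the three codon positions are filled independently according to the overall base composition, that the standard genetic code applies, and that stop codons are excluded before renormalizing. The function names and the example base composition are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def expected_aa_freqs(base_freqs):
    """Expected amino acid frequencies from base composition alone.

    Assumption (not from the source): each codon position is drawn
    independently from the same base frequencies.
    """
    total = sum(base_freqs[b] for b in BASES)
    p = {b: base_freqs[b] / total for b in BASES}
    freqs = {}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # stop codons do not encode an amino acid
        prob = p[codon[0]] * p[codon[1]] * p[codon[2]]
        freqs[aa] = freqs.get(aa, 0.0) + prob
    norm = sum(freqs.values())  # renormalize over sense codons only
    return {aa: f / norm for aa, f in freqs.items()}


def observed_aa_freqs(protein):
    """Observed amino acid frequencies in a one-letter protein sequence."""
    counts = Counter(protein)
    n = sum(counts.values())
    return {aa: c / n for aa, c in counts.items()}


def mean_abs_difference(expected, observed):
    """Mean absolute difference between calculated and observed frequencies."""
    return sum(abs(expected[aa] - observed.get(aa, 0.0)) for aa in expected) / len(expected)


if __name__ == "__main__":
    # Hypothetical A-rich base composition, for illustration only.
    a_rich = {"A": 0.4, "C": 0.2, "G": 0.2, "T": 0.2}
    for aa, f in sorted(expected_aa_freqs(a_rich).items()):
        print(aa, round(f, 4))
```

Under these assumptions, a genome biased toward one nucleotide is expected to be enriched in amino acids whose codons are rich in that nucleotide (for example, lysine, encoded by AAA and AAG, in an A-rich genome). That is the kind of correspondence between base composition and protein composition the abstract reports.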