Palacios Carmen, Wernegreen Jennifer J
Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, Massachusetts 02543, USA.
Mol Biol Evol. 2002 Sep;19(9):1575-84. doi: 10.1093/oxfordjournals.molbev.a004219.
The advent of full genome sequences provides exceptionally rich data sets to explore molecular and evolutionary mechanisms that shape divergence among and within genomes. In this study, we use multivariate analysis to determine the processes driving genome-wide patterns of amino acid usage in the obligate endosymbiont Buchnera and its close free-living relative Escherichia coli. In the AT-rich Buchnera genome, the primary source of variation in amino acid usage differentiates high- and low-expression genes. Amino acids of high-expression Buchnera genes are generally less aromatic and use relatively GC-rich codons, suggesting that selection against aromatic amino acids and against amino acids with AT-rich codons is stronger in high-expression genes. Selection to maintain hydrophobic amino acids in integral membrane proteins is a primary factor driving protein evolution in E. coli but is a secondary factor in Buchnera. In E. coli, gene expression is a secondary force driving amino acid usage, and a correlation with tRNA abundance suggests that translational selection contributes to this effect. Although this and previous studies demonstrate that AT mutational bias and genetic drift influence amino acid usage in Buchnera, this genome-wide analysis argues that selection is sufficient to affect the amino acid content of proteins with different expression and hydropathy levels.
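As a rough illustration of the quantities discussed above, the following Python sketch computes per-protein amino acid frequencies together with two descriptors, aromaticity and mean hydropathy. It is not the authors' pipeline: the abstract does not say which multivariate method or which indices were used, so the choice of the Kyte-Doolittle hydropathy scale, the definition of aromaticity as the combined frequency of Phe, Trp and Tyr, and all function names are assumptions made for illustration. A table of such frequencies (one row per protein) is the kind of input a multivariate analysis would take, and descriptors like these could then be compared against its main axes.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index (assumed scale; the abstract names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Aromatic residues (assumed definition: Phe, Trp, Tyr).
AROMATIC = frozenset("FWY")


def amino_acid_frequencies(protein):
    """Return the relative frequency of each of the 20 standard amino acids.

    Characters outside the standard alphabet (e.g. 'X' or '*') are ignored.
    """
    counts = Counter(aa for aa in protein.upper() if aa in KYTE_DOOLITTLE)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def aromaticity(freqs):
    """Combined frequency of aromatic residues in one protein."""
    return sum(freqs[aa] for aa in AROMATIC)


def mean_hydropathy(freqs):
    """Frequency-weighted mean hydropathy of one protein."""
    return sum(freqs[aa] * KYTE_DOOLITTLE[aa] for aa in AMINO_ACIDS)


def usage_table(proteins):
    """Build one row per protein: 20 frequencies plus the two descriptors.

    `proteins` maps a gene identifier to its amino acid sequence.
    """
    table = {}
    for gene_id, sequence in proteins.items():
        freqs = amino_acid_frequencies(sequence)
        row = dict(freqs)
        row["aromaticity"] = aromaticity(freqs)
        row["hydropathy"] = mean_hydropathy(freqs)
        table[gene_id] = row
    return table
```

Calling `usage_table` with a dictionary of gene identifiers and protein sequences yields one row per gene. The rows could then be passed to whichever ordination method is preferred, with expression level and hydropathy examined as candidate explanations for the leading axes.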