Pascal Géraldine, Médigue Claudine, Danchin Antoine
Genoscope/CNRS UMR 8030, Atelier de Génomique Comparative, Evry, France.
Proteins. 2005 Jul 1;60(1):27-35. doi: 10.1002/prot.20475.
The levels of cellular organization in living organisms result from a variety of selection pressures. Here we investigated the final outcome of this integrated selective process in the proteins of the best-known microbial models, Escherichia coli, Bacillus subtilis, and Methanococcus jannaschii, which are thought to have evolved separately for more than 1 billion years. Using multivariate analysis methods, including correspondence analysis, we studied the overall amino acid composition of all the proteins that make up a proteome. Building on previous results that had identified some general forces driving the amino acid composition of the proteomes of these model bacteria, we explored the correlations between the structure and functions of the proteins in a proteome and their amino acid composition. The opposition between the electric charge of amino acids and their hydrophobicity defines a highly homogeneous cluster made up exclusively of proteins that are core components of the cytoplasmic membrane of the cell (integral inner membrane proteins). A second bias is imposed by the G+C content of the genome, indicating that protein functions are so robust to amino acid changes that they can accommodate a large shift in the nucleotide content of the genome. A remarkable role of aromatic amino acids was also uncovered: expressed orphan proteins are enriched in these residues, suggesting that they might participate in a process of gain of function during evolution.
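As an illustration of the kind of analysis the abstract describes, the following is a minimal sketch of correspondence analysis applied to a table of amino acid counts per protein. It is not the authors' pipeline: the function names (`composition_table`, `correspondence_analysis`), the toy sequences, and the choice to compute the analysis through a singular value decomposition of standardized residuals with NumPy are all assumptions made for this example.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition_table(sequences):
    """Return a (proteins x 20) table of amino acid counts.

    Hypothetical helper: assumes one-letter protein sequences.
    """
    index = {aa: j for j, aa in enumerate(AMINO_ACIDS)}
    table = np.zeros((len(sequences), len(AMINO_ACIDS)))
    for i, seq in enumerate(sequences):
        for aa in seq.upper():
            if aa in index:  # skip ambiguous codes such as X, B, Z
                table[i, index[aa]] += 1
    return table


def correspondence_analysis(table, n_axes=2):
    """Classical correspondence analysis via SVD.

    Assumes every row (protein) has at least one counted residue.
    Returns row coordinates, column coordinates, the inertia of each
    axis, and a boolean mask of the amino acid columns that were kept.
    """
    kept = table.sum(axis=0) > 0          # drop residues never observed
    counts = table[:, kept]
    P = counts / counts.sum()             # correspondence matrix
    r = P.sum(axis=1)                     # row masses (proteins)
    c = P.sum(axis=0)                     # column masses (amino acids)
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)  # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates of proteins and of amino acids
    row_coords = (U[:, :n_axes] * sv[:n_axes]) / np.sqrt(r)[:, None]
    col_coords = (Vt.T[:, :n_axes] * sv[:n_axes]) / np.sqrt(c)[:, None]
    return row_coords, col_coords, sv ** 2, kept


# Toy input, invented for illustration only
toy = ["LLIVALLFVGIAL", "KKEEDDRKEDKR", "MKTAYIAKQRLLE"]
rows, cols, inertia, kept = correspondence_analysis(composition_table(toy))
```

In a sketch like this, proteins with similar composition profiles receive nearby coordinates on the leading axes, which is how a compositionally homogeneous group such as hydrophobic membrane proteins could stand out as a cluster. A real proteome-scale analysis would read sequences from annotated genome files and would need to handle very short or empty sequences explicitly.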