Bastolla Ugo, Porto Markus, Roman H Eduardo, Vendruscolo Michele
Centro de Biología Molecular Severo Ochoa, CSIC-UAM, Cantoblanco, 28049 Madrid, Spain.
BMC Evol Biol. 2006 May 31;6:43. doi: 10.1186/1471-2148-6-43.
Since thermodynamic stability is a global property of proteins that has to be conserved during evolution, the selective pressure at a given site of a protein sequence depends on the amino acids present at other sites. However, models of molecular evolution that aim at reconstructing the evolutionary history of macromolecules become computationally intractable if such correlations between sites are explicitly taken into account.
We introduce an evolutionary model in which sites evolve independently under a global constraint on the conservation of structural stability. The model consists of a selection process, which depends on two hydrophobicity parameters that can be computed from protein sequences without any fit, and a mutation process, for which we consider various models. It quantitatively reproduces the results of Structurally Constrained Neutral (SCN) simulations of protein evolution, in which the stability of the native state is explicitly computed and conserved. We then compare the predicted site-specific amino acid distributions with those sampled from the Protein Data Bank (PDB). The parameters of the mutation model, whose number varies between zero and five, are fitted from the data. For a mutation model with no free parameters and no genetic code, the mean correlation coefficient between predicted and observed site-specific amino acid distributions is larger than
The effective selection process that we propose reproduces well the amino acid distributions observed in the protein sequences in the PDB. Its simplicity makes it very promising for likelihood calculations in phylogenetic studies. Interestingly, in this approach the mutation process influences the effective selection process, i.e. selection and mutation must be entangled in order to obtain effectively independent sites. This interdependence between mutation and selection reflects the deep influence that mutation has on the evolutionary process: the mutation bias influences the thermodynamic properties of the evolving proteins, in agreement with comparative studies of bacterial proteomes, and it also influences the rate of accepted mutations.
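The comparison between predicted and observed site-specific amino acid distributions is summarized above by a mean correlation coefficient. As a minimal sketch of how such a quantity could be computed, the Python below averages the Pearson correlation over sites. It assumes that each site (or class of sites) is represented by a vector of amino acid frequencies and that the coefficient is a plain Pearson correlation averaged with equal weights; the abstract does not specify these details. The function names `pearson` and `mean_site_correlation` and the toy frequencies are hypothetical and are not taken from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def mean_site_correlation(predicted, observed):
    """Average the per-site Pearson correlation with equal weights.

    `predicted` and `observed` are lists with one frequency vector per site;
    the two vectors for a site must list the amino acids in the same order.
    Equal weighting of sites is an assumption of this sketch.
    """
    if not predicted or len(predicted) != len(observed):
        raise ValueError("need the same non-zero number of sites in both inputs")
    per_site = [pearson(p, o) for p, o in zip(predicted, observed)]
    return sum(per_site) / len(per_site)


if __name__ == "__main__":
    # Toy data: two sites over a reduced four-letter alphabet (made-up numbers).
    predicted = [[0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4]]
    observed = [[0.5, 0.2, 0.2, 0.1], [0.2, 0.2, 0.2, 0.4]]
    print(mean_site_correlation(predicted, observed))
```

With real data, each vector would hold 20 frequencies, one per amino acid, and a site whose predicted or observed distribution is exactly uniform would have to be handled separately, because the correlation is undefined there.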