Département de Biochimie, Centre Robert Cedergren, Université de Montréal, Montreal, Quebec, Canada.
Mol Biol Evol. 2010 Jul;27(7):1546-60. doi: 10.1093/molbev/msq047. Epub 2010 Feb 16.
Assessing the influence of three-dimensional protein structure on sequence evolution is a difficult task, mainly because of the assumption of independence between sites required by probabilistic phylogenetic methods. Recently, models that include an explicit treatment of protein structure and site interdependencies have been developed: a statistical potential (an energy-like scoring system for sequence-structure compatibility) is used to evaluate the probability of fixation of a given mutation, assuming a coarse-grained protein structure that is constant through evolution. Yet, due to the novelty of these models and the small degree of overlap between the fields of structural and evolutionary biology, only simple representations of protein structure have been used so far. In this work, we present new forms of statistical potentials using a probabilistic framework recently developed for evolutionary studies. Terms related to pairwise distance interactions, torsion angles, solvent accessibility, and flexibility of the residues are included in the potentials, so as to study the effects of the main factors known to influence protein structure. The new potentials, with a more detailed representation of the protein structure, yield a better fit than the previously used scoring functions, with pairwise interactions contributing to more than half of this improvement. In a phylogenetic context, however, the structurally constrained models are still outperformed by some of the available site-independent models in terms of fit, possibly indicating that alternatives to coarse-grained statistical potentials should be explored in order to better model structural constraints.