Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
PLoS One. 2010 Nov 10;5(11):e13714. doi: 10.1371/journal.pone.0013714.
Understanding protein structure is of crucial importance in science, medicine and biotechnology. For about two decades, knowledge-based potentials based on pairwise distances, so-called "potentials of mean force" (PMFs), have been center stage in the prediction and design of protein structure and the simulation of protein folding. However, the validity, scope and limitations of these potentials are still vigorously debated, and the optimal choice of the reference state, a necessary component of these potentials, is an unsolved problem. PMFs are loosely justified by analogy to the reversible work theorem in statistical physics, or by a statistical argument based on a likelihood function. Both justifications are insightful but leave many questions unanswered. Here, we show for the first time that PMFs can be seen as approximations to quantities that do have a rigorous probabilistic justification: they naturally arise when probability distributions over different features of proteins need to be combined. We call these quantities "reference ratio distributions"; they derive from the application of the "reference ratio method". This new view is not only of theoretical relevance but also leads to many insights that are of direct practical use: the reference state is uniquely defined and does not require external physical insights; the approach can be generalized beyond pairwise distances to arbitrary features of protein structure; and it becomes clear for which purposes the use of these quantities is justified. We illustrate these insights with two applications, involving the radius of gyration and hydrogen bonding. In the latter case, we also show how the reference ratio method can be iteratively applied to sculpt an energy funnel. Our results considerably increase the understanding and scope of energy functions derived from known biomolecular structures.
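The idea of combining probability distributions over different features can be illustrated with a small discrete toy model. The sketch below is not the authors' implementation. It assumes a finite set of "structures" x, an initial distribution `q_x` over them, a coarse feature y = f(x), and a desired distribution `p_y` over that feature. All names (`reference_ratio`, `marginal`, `pair_sum`) and the toy numbers are hypothetical. Under these assumptions, multiplying q(x) by the ratio of `p_y` to the feature's marginal under q (which plays the role of the reference state) gives a distribution whose feature marginal matches `p_y`.

```python
from collections import defaultdict


def marginal(p_x, feature):
    """Marginal distribution of feature(x) when x is distributed as p_x."""
    out = defaultdict(float)
    for x, prob in p_x.items():
        out[feature(x)] += prob
    return dict(out)


def reference_ratio(q_x, feature, p_y):
    """Reweight q_x so that the marginal over feature(x) becomes p_y.

    Assumes every feature value with p_y > 0 is reachable under q_x
    and that no feature value has zero total mass under q_x.
    """
    # Reference distribution: what q_x itself implies about the feature.
    q_y = marginal(q_x, feature)
    # p(x) = q(x) * p_y(f(x)) / q_y(f(x))
    return {
        x: prob * p_y.get(feature(x), 0.0) / q_y[feature(x)]
        for x, prob in q_x.items()
    }


def pair_sum(x):
    """Hypothetical coarse feature: sum of the two components of x."""
    return x[0] + x[1]


# Toy model: nine "structures", uniform under q_x.
q_x = {(a, b): 1.0 / 9.0 for a in range(3) for b in range(3)}

# Desired distribution over the feature (illustrative numbers only).
p_y = {0: 0.1, 1: 0.2, 2: 0.4, 3: 0.2, 4: 0.1}

p_x = reference_ratio(q_x, pair_sum, p_y)

# The feature marginal under p_x should now agree with p_y.
for y, prob in sorted(marginal(p_x, pair_sum).items()):
    print(y, round(prob, 6), p_y[y])
```

In this toy setting the reference distribution is computed from `q_x` rather than chosen by hand. Taking the negative logarithm of the ratio `p_y / q_y` gives a term with the usual shape of a knowledge-based potential.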