Thomas P D, Dill K A
Graduate Group in Biophysics, University of California, San Francisco, 94143-0448, USA.
J Mol Biol. 1996 Mar 29;257(2):457-69. doi: 10.1006/jmbi.1996.0175.
"Statistical potentials" are energies widely used in computer algorithms to fold, dock, or recognize protein structures. They are derived from: (1) observed pairing frequencies of the 20 amino acids in databases of known protein structures, and (2) approximations and assumptions about the physical process that these quantities measure. Using exact lattice models, we construct a rigorous test of those assumptions and approximations. We find that statistical potentials often correctly rank-order the relative strengths of interresidue interactions, but they do not reflect the true underlying energies because of systematic errors arising from the neglect of excluded volume in proteins. We find that complex residue-residue distance dependences observed in statistical potentials, even those among charged groups, can be largely explained as an indirect consequence of the burial of non-polar groups. Our results suggest that current statistical potentials may have limited value in protein folding algorithms and wherever they are used to provide energy-like quantities.
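As an illustration of how a contact-based statistical potential can be derived from observed pairing frequencies, the following is a minimal sketch. It assumes a quasi-chemical (inverse-Boltzmann) form, e(a, b) = -kT ln(observed / expected), with a random-mixing reference state. The function name `statistical_potential`, its parameters, and the two-letter H/P alphabet with its toy counts are hypothetical and are not taken from the paper. The reference state used here ignores chain connectivity and excluded volume, which is the kind of simplification the abstract identifies as a source of systematic error.

```python
import math
from collections import Counter
from itertools import combinations_with_replacement


def statistical_potential(contacts, composition, kT=1.0):
    """Estimate pairwise contact energies from observed contact counts.

    contacts:    iterable of (a, b) residue-type pairs seen in contact
    composition: dict mapping residue type -> number of residues in the database
    kT:          energy unit (assumption: energies are reported in units of kT)
    """
    # Count unordered pairs, so (H, P) and (P, H) are the same contact type.
    pair_counts = Counter(tuple(sorted(pair)) for pair in contacts)
    total_pairs = sum(pair_counts.values())
    total_residues = sum(composition.values())

    potential = {}
    for a, b in combinations_with_replacement(sorted(composition), 2):
        x_a = composition[a] / total_residues
        x_b = composition[b] / total_residues
        # Random-mixing reference state (assumption): unlike pairs can form two ways.
        expected = x_a * x_b * (1 if a == b else 2)
        observed = pair_counts.get((a, b), 0) / total_pairs
        if observed > 0:  # never-observed pairs are left undefined
            potential[(a, b)] = -kT * math.log(observed / expected)
    return potential


# Toy data (invented for illustration): a two-letter hydrophobic/polar alphabet.
toy_contacts = [("H", "H")] * 6 + [("P", "H")] * 3 + [("P", "P")] * 1
toy_composition = {"H": 50, "P": 50}
toy_potential = statistical_potential(toy_contacts, toy_composition)
```

With these toy counts, H-H contacts are over-represented relative to random mixing and receive a negative (favorable) energy, while H-P and P-P contacts receive positive ones. Whether such values track the true underlying energies, rather than only their rank order, is the question the paper tests with exact lattice models.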