Bolser Dan M, Filippis Ioannis, Stehr Henning, Duarte Jose, Lappe Michael
The Max Planck Institute for Molecular Genetics, Berlin, Germany.
BMC Struct Biol. 2008 Dec 8;8:53. doi: 10.1186/1472-6807-8-53.
For over 30 years, potentials of mean force have been used to evaluate the relative energy of protein structures. The most commonly used potentials define the energy of residue-residue interactions and are derived from empirical analysis of known protein structures. However, single-body residue 'environment' potentials, although widely used in protein structure analysis, have not been rigorously compared to these classical two-body residue-residue interaction potentials. Here we do not attempt to combine the two types of residue interaction potential; instead, we assess their independent contributions to scoring protein structures.
A data set of nearly three thousand monomers was used to compare pairwise residue-residue 'contact-type' propensities to single-body residue 'contact-count' propensities. Using a large and standard set of protein decoys we performed an in-depth comparison of these two types of residue interaction propensities. The scores derived from the contact-type and contact-count propensities were assessed using two different performance metrics and were compared using 90 different definitions of residue-residue contact. Our findings show that both types of score perform equally well on the task of discriminating between near-native protein decoys. However, in a statistical sense, the contact-count based scores were found to carry more information than the contact-type based scores.
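The distinction between the two kinds of propensity can be illustrated with a minimal sketch. It is not the authors' implementation. The distance cutoff, the log-odds form of the propensity, the reference state, and all function names are assumptions chosen for illustration; the abstract says only that 90 different contact definitions were compared.

```python
import math
from collections import Counter
from itertools import combinations

def contacts(coords, cutoff=8.0):
    """Index pairs (i, j), i < j, whose points lie within `cutoff`.
    The 8.0 cutoff is an illustrative assumption, not a value from the paper."""
    return [(i, j) for i, j in combinations(range(len(coords)), 2)
            if math.dist(coords[i], coords[j]) <= cutoff]

def contact_type_propensity(seq, pairs):
    """Two-body term: log-odds of seeing residue types a and b in contact,
    relative to random pairing of the residues that take part in contacts."""
    pair_obs = Counter(tuple(sorted((seq[i], seq[j]))) for i, j in pairs)
    end_obs = Counter()
    for i, j in pairs:
        end_obs[seq[i]] += 1
        end_obs[seq[j]] += 1
    n_pairs, n_ends = len(pairs), 2 * len(pairs)
    prop = {}
    for (a, b), n in pair_obs.items():
        fa, fb = end_obs[a] / n_ends, end_obs[b] / n_ends
        expected = fa * fb * (1 if a == b else 2)  # unordered pair
        prop[(a, b)] = math.log((n / n_pairs) / expected)
    return prop

def contact_count_propensity(seq, pairs):
    """Single-body term: log-odds of residue type a having n contacts,
    relative to independence of residue type and contact count."""
    degree = Counter()
    for i, j in pairs:
        degree[i] += 1
        degree[j] += 1
    total = len(seq)
    joint = Counter((seq[i], degree[i]) for i in range(total))
    res = Counter(seq)
    cnt = Counter(degree[i] for i in range(total))
    return {(a, n): math.log((c / total) / ((res[a] / total) * (cnt[n] / total)))
            for (a, n), c in joint.items()}
```

In this sketch, a structure would be scored by summing the contact-type propensity over its contacting pairs, or the contact-count propensity over its residues. The second sum has one term per residue rather than one per pair, which is what makes it a single-body score.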
Our analysis has shown that the performance of either type of score is very similar on a range of different decoys. This similarity suggests a common underlying biophysical principle for both types of residue interaction propensity. However, several features of the contact-count based propensity suggest that it should be used in preference to the contact-type based propensity. Specifically, it has been shown that contact-counts can be predicted from sequence information alone. In addition, the use of a single-body term allows for efficient alignment strategies using dynamic programming, which is useful for fold recognition, for example. These facts, combined with the relative simplicity of the contact-count propensity, suggest that contact-counts should be studied in more detail in the future.
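The point about dynamic programming can be sketched as follows. Because a single-body score for placing a residue at a structural position depends only on that residue and on the contact count of that position, a standard global alignment recurrence applies. The function name, the linear gap penalty, and the toy scoring function are all hypothetical; this is a textbook Needleman-Wunsch recurrence used for illustration, not a method described in the source.

```python
def align_score(seq, counts, score, gap=-1.0):
    """Best global alignment score of a sequence against a list of
    per-position contact counts, using a linear gap penalty (assumed)."""
    rows, cols = len(seq) + 1, len(counts) + 1
    dp = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = i * gap
    for j in range(1, cols):
        dp[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(seq[i - 1], counts[j - 1]),  # place residue
                dp[i - 1][j] + gap,   # residue left unplaced
                dp[i][j - 1] + gap,   # structural position left empty
            )
    return dp[-1][-1]

def toy_score(residue, n_contacts):
    """Hypothetical propensity: L prefers buried (many-contact) positions,
    K prefers exposed (few-contact) positions."""
    buried = n_contacts >= 5
    return 1.0 if (residue == "L") == buried else -1.0
```

A pairwise contact-type score does not decompose this way, since the score of one placement depends on where the contacting partner is placed, so the same recurrence cannot be applied to it directly.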