Stroud R M, Fauman E B
Department of Biochemistry and Biophysics, University of California-San Francisco 94143-0448, USA.
Protein Sci. 1995 Nov;4(11):2392-404. doi: 10.1002/pro.5560041118.
A quantitative expression is derived that is key to evaluating whether structural differences or induced shifts between any two protein structures are significant. Because crystallography reports a single (or sometimes dual) position for each atom, the significance of any structural change inferred by comparing two structures depends critically on knowing the expected precision of each reported median atomic position, and on being able to extract that precision for each atom from the information provided in the Protein Data Bank and in the publication. Differences between structures of protein molecules that should be identical were analyzed against many potential indicators of structure precision; the analysis used differences that are normally distributed, which indicates that they are not affected by crystal contacts. The aim was to extract, essentially by "machine learning" principles, a generally applicable expression involving the highest correlates. Eighteen refined crystal structures from the Protein Data Bank with multiple molecules in the crystallographic asymmetric unit were selected and compared. The thermal B factor, the connectivity of the atom, and the ratio of the number of reflections to the number of atoms used in refinement correlate best with the magnitude of the positional differences between regions of the structures that would otherwise be expected to be the same. These results are embodied in a six-parameter equation that can be applied to any crystallographically refined structure to estimate the expected uncertainty in the position of each atom. Structural change in a macromolecule can thus be referenced to the expected uncertainty in atomic position, as reflected in the variance between otherwise identical structures with the observed values of the correlated parameters.
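As a rough illustration of how a per-atom uncertainty could be used to judge a shift, the sketch below compares the displacement of one atom between two structures with the combined expected uncertainty. The abstract does not reproduce the six-parameter equation, so the per-atom uncertainties (`sigma_a`, `sigma_b`) are taken as given inputs. The function name and the choice to combine the two uncertainties in quadrature (which assumes independent errors) are assumptions made for this example, not the authors' stated procedure.

```python
import math


def shift_significance(pos_a, pos_b, sigma_a, sigma_b):
    """Return (distance, ratio) for one atom seen in two structures.

    pos_a, pos_b     : (x, y, z) coordinates in angstroms.
    sigma_a, sigma_b : expected positional uncertainty of the atom in each
                       structure, in angstroms. These are assumed to come
                       from some per-atom estimate; the six-parameter
                       equation itself is not implemented here.
    """
    # Observed displacement between the two reported positions.
    distance = math.dist(pos_a, pos_b)

    # Assumption: errors in the two structures are independent, so the
    # expected uncertainty of the difference adds in quadrature.
    combined_sigma = math.hypot(sigma_a, sigma_b)
    if combined_sigma == 0:
        raise ValueError("combined uncertainty must be positive")

    # Ratio of observed shift to expected uncertainty; larger values
    # suggest a shift that is less likely to be coordinate error alone.
    return distance, distance / combined_sigma


if __name__ == "__main__":
    d, ratio = shift_significance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), 0.3, 0.4)
    print(f"shift = {d:.2f} A, shift / expected uncertainty = {ratio:.1f}")
```

Where to place a significance cutoff on the ratio is left open here, since the abstract does not specify one.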