Terwilliger Thomas C, Grosse-Kunstleve Ralf W, Afonine Pavel V, Adams Paul D, Moriarty Nigel W, Zwart Peter, Read Randy J, Turk Dusan, Hung Li Wei
Los Alamos National Laboratory, Mailstop M888, Los Alamos, NM 87545, USA.
Acta Crystallogr D Biol Crystallogr. 2007 May;63(Pt 5):597-610. doi: 10.1107/S0907444907009791. Epub 2007 Apr 21.
Automation of iterative model building, density modification and refinement in macromolecular crystallography has made it feasible to carry out this entire process multiple times. By using different random seeds in the process, a number of different models compatible with experimental data can be created. Sets of models were generated in this way using real data for ten protein structures from the Protein Data Bank and using synthetic data generated at various resolutions. Most of the heterogeneity among models produced in this way is in the side chains and loops on the protein surface. Possible interpretations of the variation among models created by repetitive rebuilding were investigated. Synthetic data were created in which a crystal structure was modelled as the average of a set of 'perfect' structures and the range of models obtained by rebuilding a single starting model was examined. The standard deviations of coordinates in models obtained by repetitive rebuilding at high resolution are small, while those obtained for the same synthetic crystal structure at low resolution are large, so that the diversity within a group of models cannot generally be a quantitative reflection of the actual structures in a crystal. Instead, the group of structures obtained by repetitive rebuilding reflects the precision of the models, and the standard deviation of coordinates of these structures is a lower bound estimate of the uncertainty in coordinates of the individual models.
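The standard deviation of coordinates across a set of rebuilt models can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that all models list the same atoms in the same order, that they share a common coordinate frame (for example, because they were refined against the same crystal data), and that the per-atom spread is taken as the sample standard deviation of position about the mean position. The function name `coordinate_spread` and the toy coordinates are hypothetical.

```python
import math


def coordinate_spread(models):
    """Per-atom standard deviation of position across a set of models.

    models: list of models, each a list of (x, y, z) tuples in angstroms.
    Assumption: every model has the same atoms in the same order and all
    models are expressed in the same coordinate frame.
    """
    n_models = len(models)
    if n_models < 2:
        raise ValueError("at least two models are needed to estimate a spread")
    n_atoms = len(models[0])
    if any(len(model) != n_atoms for model in models):
        raise ValueError("all models must contain the same number of atoms")

    spreads = []
    for atom in range(n_atoms):
        # Mean position of this atom over all models.
        mean = [sum(model[atom][k] for model in models) / n_models
                for k in range(3)]
        # Sum of squared distances from the mean position.
        sq_sum = sum(
            sum((model[atom][k] - mean[k]) ** 2 for k in range(3))
            for model in models
        )
        # Sample standard deviation (n - 1 in the denominator).
        spreads.append(math.sqrt(sq_sum / (n_models - 1)))
    return spreads


# Toy ensemble (made-up numbers): atom 0 is identical in every model,
# atom 1 moves by 0.2 angstroms along x between models.
toy_models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.8, 0.0, 0.0)],
]
toy_spreads = coordinate_spread(toy_models)
```

For the toy ensemble the first atom has zero spread and the second a spread of about 0.2 angstroms. Following the abstract, a value computed this way would be read as a lower bound on the coordinate uncertainty of an individual model, not as a measure of the real structural variation in the crystal.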