Maiorov V N, Crippen G M
College of Pharmacy, University of Michigan, Ann Arbor 48109.
J Mol Biol. 1994 Jan 14;235(2):625-34. doi: 10.1006/jmbi.1994.1017.
In the study of globular protein conformations, one customarily measures the similarity in three-dimensional structure by the root-mean-square deviation (RMSD) of the Cα atomic coordinates after optimal rigid-body superposition. Even when the two protein structures each consist of a single chain having the same number of residues, so that the matching of Cα atoms is obvious, it is not clear how to interpret the RMSD. A very large value means they are dissimilar, and zero means they are identical in conformation, but at what intermediate values are they particularly similar or clearly dissimilar? While many workers in the field have chosen arbitrary cutoffs, and others have judged values of RMSD according to the observed distribution of RMSD for random structures, we propose a self-referential, non-statistical standard. We take two conformers to be intrinsically similar if their RMSD is smaller than the RMSD obtained when one of them is mirror-inverted. Because the structures considered here are not arbitrary configurations of point atoms, but are compact, globular, polypeptide chains, our definition is closely related to similarity in radius of gyration and overall chain folding patterns. Being strongly similar in our sense implies that the radii of gyration must be nearly identical, the root-mean-square deviation in interatomic distances is linearly related to RMSD, and the two chains must have the same general fold. Only when the RMSD exceeds this level can parts of the polypeptide chain undergo nontrivial rearrangements while remaining globular. This enables us to judge when a prediction of a protein's conformation is "correct except for minor perturbations", or when the ensemble of protein structures deduced from NMR experiments is "basically in mutual agreement".
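The mirror-image criterion can be illustrated with a minimal sketch. It assumes two conformers given as (N, 3) NumPy arrays of matched Cα coordinates and computes the optimal-superposition RMSD from the singular values of the covariance matrix (the standard Kabsch approach, used here for illustration and not necessarily the authors' own procedure). The function names `kabsch_rmsd`, `mirror_rmsd`, and `intrinsically_similar` are hypothetical and do not come from the paper.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD of two (N, 3) coordinate sets after optimal rigid-body superposition.

    Only proper rotations are allowed (no reflection), so a structure and
    its mirror image generally give a nonzero value.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")

    # Remove the translational degrees of freedom.
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)

    # Singular values of the 3x3 covariance matrix give the best overlap.
    U, S, Vt = np.linalg.svd(P.T @ Q)

    # If the best orthogonal transform is a reflection, flip the sign of
    # the smallest singular value to restrict to proper rotations.
    d = 1.0 if np.linalg.det(U @ Vt) >= 0 else -1.0
    overlap = S[0] + S[1] + d * S[2]

    sq_dev = (P ** 2).sum() + (Q ** 2).sum() - 2.0 * overlap
    return float(np.sqrt(max(sq_dev, 0.0) / len(P)))


def mirror_rmsd(P, Q):
    """RMSD between P and the mirror image of Q (z coordinate negated)."""
    Q_mirror = np.asarray(Q, dtype=float) * np.array([1.0, 1.0, -1.0])
    return kabsch_rmsd(P, Q_mirror)


def intrinsically_similar(P, Q):
    """True if RMSD(P, Q) is smaller than RMSD(P, mirror image of Q)."""
    return kabsch_rmsd(P, Q) < mirror_rmsd(P, Q)
```

Which plane is used for the inversion does not matter, because all mirror images of a structure are related by rigid-body motions and the superposition step removes those. The sketch assumes the two chains already have the same length and a one-to-one residue correspondence, as in the setting the abstract describes.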