Levitt M
Beckman Laboratories for Structural Biology, Department of Cell Biology, Stanford University Medical Center, CA 94305.
J Mol Biol. 1992 Jul 20;226(2):507-33. doi: 10.1016/0022-2836(92)90964-l.
Segment match modeling uses a database of highly refined known protein X-ray structures to build an unknown target structure from its amino acid sequence and the atomic coordinates of a few of its atoms (generally only the Cα atoms). The target structure is first broken into a set of short segments. The database is then searched for matching segments, which are fitted onto the framework of the target structure. Three criteria are used for choosing a matching database segment: amino acid sequence similarity, conformational similarity (atomic coordinates), and compatibility with the target structure (van der Waals interactions). The new method works surprisingly well: for eight test proteins ranging in size from 46 to 323 residues, the all-atom root-mean-square deviation of the modeled structures is between 0.93 Å and 1.73 Å (the average is 1.26 Å). Deviations of this magnitude are comparable with those found for protein coordinates before and after refinement against X-ray data or for coordinates of the same protein in different crystal packings. These results are insensitive to errors in the Cα positions or to missing Cα atoms: accurate models can be built with Cα errors of up to 1 Å or by using only half the Cα atoms. The fit to the X-ray structures is improved significantly by building several independent models based on different random choices and then averaging coordinates; this novel concept has general implications for other modeling tasks. The segment match modeling method is fully automatic, yields a complete set of atomic coordinates without any human intervention and is efficient (14 s/residue on the Silicon Graphics 4D/25 Personal Iris workstation).
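The averaging step and the root-mean-square deviation used to judge the models can be illustrated with a minimal Python sketch. It is not the published implementation: the function names `rmsd` and `average_models` and the toy coordinates are hypothetical, and the sketch assumes all models are already expressed in the same coordinate frame with atoms in the same order, so no superposition is performed.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) points.

    Assumption: both structures are already in the same frame; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def average_models(models):
    """Average several independently built models atom by atom.

    Hypothetical helper: each model is a list of (x, y, z) tuples with
    identical atom ordering.
    """
    if not models:
        raise ValueError("at least one model is required")
    n_atoms = len(models[0])
    if any(len(model) != n_atoms for model in models):
        raise ValueError("all models must contain the same number of atoms")
    n_models = len(models)
    return [
        tuple(sum(model[i][k] for model in models) / n_models for k in range(3))
        for i in range(n_atoms)
    ]


# Toy data (invented for illustration, not taken from the paper):
# a three-atom "reference" and two models with errors in different directions.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
model_a = [(x + 0.4, y, z) for x, y, z in reference]  # displaced along x
model_b = [(x, y + 0.4, z) for x, y, z in reference]  # displaced along y

averaged = average_models([model_a, model_b])

print("model A RMSD: ", round(rmsd(model_a, reference), 3))
print("model B RMSD: ", round(rmsd(model_b, reference), 3))
print("averaged RMSD:", round(rmsd(averaged, reference), 3))
```

In this toy case the errors of the two models point in different directions, so they partly cancel on averaging and the averaged coordinates lie closer to the reference than either model alone. Real models would need a least-squares superposition onto the reference before the deviation is computed.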