Hilbert M, Böhm G, Jaenicke R
Institute for Biophysics and Physical Biochemistry, University of Regensburg, Germany.
Proteins. 1993 Oct;17(2):138-51. doi: 10.1002/prot.340170204.
Protein structure prediction is based mainly on the modeling of proteins by homology to known structures; this knowledge-based approach is the most promising method to date. Although it is used throughout protein research, no general rules concerning the quality and applicability of the concepts and procedures used in homology modeling have yet been put forward. The main goal of the present work is therefore to provide tools for assessing the accuracy of modeling at a given level of sequence homology. A large set of known structures from different conformational and functional classes, and with various degrees of homology, was selected, and pairwise structure superpositions were performed. Starting with the definition of the structurally conserved regions and the determination of topologically correct sequence alignments, we correlated geometrical properties with sequence homology (defined by the 250 PAM Dayhoff matrix) and identity. It is shown that both the topological differences of the protein backbones and the relative positions of corresponding side chains diverge with decreasing sequence identity. Below 50% identity, the deviation in regions that are not structurally conserved increases continually, implying that with decreasing sequence identity modeling has to take into account more and more structurally diverging loop regions that are difficult to predict.
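The two quantities correlated in the abstract, sequence identity and geometrical deviation, can be illustrated with a minimal sketch. It is not the authors' procedure: the function names `percent_identity` and `rmsd` are hypothetical, identity is assumed to be counted over alignment columns without gaps (the abstract does not state which denominator was used), and the coordinates are assumed to be equivalent atoms that have already been superimposed.

```python
import math


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap.

    Assumption: gap-free columns form the denominator; other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between equivalent atoms.

    Assumption: both coordinate lists are already superimposed and are given
    in the same order, one (x, y, z) tuple per equivalent atom.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy alignment and toy coordinates, invented purely for illustration.
    identity = percent_identity("ACDEFG-H", "ACDQFGKH")
    deviation = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                     [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)])
    print(f"identity = {identity:.1f}%, RMSD = {deviation:.2f}")
```

Computing both values for many structure pairs, separately for conserved and non-conserved regions, would give the kind of identity-versus-deviation relationship the abstract describes; the superposition step itself is left out here and would need a dedicated fitting routine.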