Jaroszewski Lukasz
The Burnham Institute, La Jolla, CA, USA.
Methods Mol Biol. 2009;569:129-56. doi: 10.1007/978-1-59745-524-4_7.
The observation that similar protein sequences fold into similar three-dimensional structures provides a basis for methods that predict structural features of a novel protein from the similarity between its sequence and the sequences of known protein structures. Similarity over an entire sequence or over large sequence fragments enables prediction and modeling of entire structural domains, while statistics derived from distributions of local features of known protein structures make it possible to predict such features in proteins of unknown structure. The accuracy of protein structure models is sufficient for many practical purposes, such as analysis of point mutation effects, enzymatic reactions, interaction interfaces of protein complexes, and active sites. Protein models are also used for phasing of crystallographic data and, in some cases, for drug design. By using models, one can avoid the costly and time-consuming process of experimental structure determination. The purpose of this chapter is to give a practical review of the most popular protein structure prediction methods based on sequence similarity and to outline a practical approach to protein structure prediction. While the main focus of this chapter is on template-based protein structure prediction, it also provides references to other methods and programs that play an important role in protein structure prediction.