Crippen G M
College of Pharmacy, University of Michigan, Ann Arbor 48109.
Biochemistry. 1991 Apr 30;30(17):4232-7. doi: 10.1021/bi00231a018.
Predicting the three-dimensional structure of a protein given only its amino acid sequence is a long-standing goal in computational chemistry. In the thermodynamic approach, one needs a potential function of conformation that resembles the free energy of the real protein to the extent that the global minimum of the potential is attained by the native conformation and no other. In practice, this has never been achieved with certainty because even with greatly simplified representations of the polypeptide chain, there are an astronomical number of local minima to examine. If one chooses instead a protein representation with only a large but manageable number of discrete conformations, then the global preference of the potential for the native can be directly verified. Representing a protein as a walk on a two-dimensional square lattice makes it easy to see that simple functions of the interresidue contacts are sufficient to globally favor a given "native" conformation, as long as it is a compact, globular structure. Explicit representation of the solvent is not required. Another more realistic way to confine the conformational search to a finite set is to draw alternative conformations from fragments of larger proteins having known crystal structure. Then it is possible to construct a simple function of interresidue contacts in three dimensions such that only 8 proteins are required to determine the adjustable parameters, and the native conformations of 37 other proteins are correctly preferred over all alternative conformations. The deduced function favors short-range backbone-backbone contacts regardless of residue type and long-range hydrophobic associations. Interactions over long distances, such as electrostatics, are not required.
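The two-dimensional lattice argument in the abstract can be illustrated with a small enumeration. The following is a minimal sketch, not the potential deduced in the paper: it assumes a hypothetical two-letter (H/P) residue alphabet and a single illustrative weight of -1 per hydrophobic contact, and the helper names (`walks`, `contacts`, `energy`, `rank_native`) are invented here. It lists every self-avoiding walk of a given length on the square lattice, scores each one with a simple function of interresidue contacts, and counts how many alternatives score below, or tie with, a chosen "native" conformation.

```python
# Illustrative only: exhaustive check that a contact potential prefers a
# chosen "native" walk on the 2D square lattice. Weights are assumptions.

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def walks(n):
    """Yield every self-avoiding walk of n residues as a list of (x, y).

    The first bond is fixed along +x, which removes the 4-fold rotational
    symmetry (mirror images are still enumerated separately).
    """
    if n < 2:
        raise ValueError("need at least two residues")

    def extend(path, occupied):
        if len(path) == n:
            yield list(path)
            return
        x, y = path[-1]
        for dx, dy in STEPS:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                path.append(nxt)
                occupied.add(nxt)
                yield from extend(path, occupied)
                path.pop()
                occupied.remove(nxt)

    start = [(0, 0), (1, 0)]
    yield from extend(start, set(start))


def contacts(path):
    """Return the set of (i, j), i < j, of lattice neighbors not bonded in sequence."""
    index = {p: i for i, p in enumerate(path)}
    pairs = set()
    for i, (x, y) in enumerate(path):
        for dx, dy in ((1, 0), (0, 1)):  # look right and up; covers each pair once
            j = index.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1:
                pairs.add((min(i, j), max(i, j)))
    return pairs


def energy(path, sequence, weights):
    """Sum the weight of each contact, keyed by the sorted pair of residue types."""
    return sum(
        weights.get(tuple(sorted((sequence[i], sequence[j]))), 0)
        for i, j in contacts(path)
    )


def rank_native(sequence, native, weights):
    """Count walks scoring strictly lower than, and equal to, the native walk."""
    e_native = energy(native, sequence, weights)
    lower = ties = 0
    for path in walks(len(sequence)):
        e = energy(path, sequence, weights)
        if e < e_native:
            lower += 1
        elif e == e_native:
            ties += 1
    return lower, ties


if __name__ == "__main__":
    HH_WEIGHTS = {("H", "H"): -1}            # assumed, not from the paper
    native = [(0, 0), (1, 0), (1, 1), (0, 1)]  # compact 2x2 square
    print(rank_native("HPPH", native, HH_WEIGHTS))
```

For this four-residue toy case, no walk scores lower than the compact native, and the only tie other than the native itself is its mirror image. Because the conformation set is finite, the global preference is verified by direct enumeration rather than by minimization, which is the point the abstract makes about discrete representations.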