Bahar I, Atilgan A R, Jernigan R L, Erman B
Molecular Structure Section, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892-5677, USA.
Proteins. 1997 Oct;29(2):172-85.
Knowledge of amino acid composition alone is verified here to be sufficient for recognizing the structural class (alpha, beta, alpha + beta, or alpha/beta) of a given protein with an accuracy of 81%. This is supported by results from exhaustive enumerations of all conformations for all sequences of simple, compact lattice models consisting of two types of residues (hydrophobic and polar). Different compositions exhibit strong affinities for certain folds. Within the limits of validity of the lattice models, two factors appear to determine the choice of particular folds: 1) the coordination numbers of individual sites and 2) the size and geometry of non-bonded clusters. These two properties, collectively termed the distribution of non-bonded contacts, are quantitatively assessed by an eigenvalue analysis of the so-called Kirchhoff or adjacency matrices obtained by considering the non-bonded interactions on a lattice. The analysis permits the identification of conformations that possess the same distribution of non-bonded contacts. Furthermore, some distributions of non-bonded contacts are favored entropically because of their high degeneracies. Thus, a competition between enthalpic and entropic effects determines the choice of a distribution for a given composition. Based on these findings, an analysis of non-bonded contacts in protein structures was made. The analysis shows that proteins belonging to the four distinct folding classes exhibit significant differences in their distributions of non-bonded contacts, which more directly explains the success in predicting structural class from amino acid composition.
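The eigenvalue analysis described above can be illustrated with a minimal sketch. It is not the authors' code and rests on assumptions: a two-dimensional square lattice, a chain given as a list of integer (x, y) sites, and the Kirchhoff matrix taken as the graph Laplacian (degree matrix minus adjacency matrix) of the non-bonded contacts. The function names `kirchhoff_matrix` and `contact_spectrum` and the two example conformations are hypothetical.

```python
import numpy as np


def kirchhoff_matrix(coords):
    """Kirchhoff matrix (degree minus adjacency) of non-bonded lattice contacts.

    Assumption: coords is a self-avoiding walk on a 2D square lattice,
    listed in chain order as (x, y) integer pairs.
    """
    n = len(coords)
    adjacency = np.zeros((n, n), dtype=int)
    for i in range(n):
        # Start at i + 2 so that chain-bonded neighbours are excluded.
        for j in range(i + 2, n):
            dx = abs(coords[i][0] - coords[j][0])
            dy = abs(coords[i][1] - coords[j][1])
            if dx + dy == 1:  # nearest neighbours on the lattice
                adjacency[i, j] = adjacency[j, i] = 1
    # Diagonal entries are the non-bonded coordination numbers of the sites.
    degrees = adjacency.sum(axis=1)
    return np.diag(degrees) - adjacency


def contact_spectrum(coords):
    """Sorted, rounded eigenvalues, used here as a contact-distribution label."""
    eigenvalues = np.linalg.eigvalsh(kirchhoff_matrix(coords))
    # Adding 0.0 turns any -0.0 produced by rounding into 0.0.
    return tuple(np.round(eigenvalues, 6) + 0.0)


# Two compact 9-site conformations on a 3 x 3 lattice (hypothetical examples).
snake = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1), (0, 2), (1, 2), (2, 2)]
spiral = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1), (1, 1)]
mirrored_snake = [(2 - x, y) for x, y in snake]

print(contact_spectrum(snake) == contact_spectrum(mirrored_snake))
print(contact_spectrum(snake) == contact_spectrum(spiral))
```

In this sketch, conformations related by a lattice symmetry share a spectrum, while conformations whose non-bonded clusters differ in size or geometry generally do not. The diagonal of the matrix carries the coordination numbers, and the off-diagonal pattern carries the cluster geometry, so the spectrum reflects both factors named in the abstract.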