Glaser F, Steinberg D M, Vakser I A, Ben-Tal N
Department of Biochemistry, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv 69978, Israel.
Proteins. 2001 May 1;43(2):89-102.
We used a nonredundant set of 621 protein-protein interfaces of known high-resolution structure to derive residue composition and residue-residue contact preferences. The residue compositions at the interfaces, in entire proteins, and in whole genomes correlate well, indicating the statistical strength of the data set. Amino acid distributions differed between interfaces with a buried surface area of less than 1,000 Å² and interfaces with an area of more than 5,000 Å². Hydrophobic residues were abundant in large interfaces, whereas polar residues were more abundant in small interfaces. The largest residue-residue preferences at the interface were recorded for interactions between pairs of large hydrophobic residues, such as Trp and Leu, and the smallest preferences for pairs of small residues, such as Gly and Ala. On average, contacts between hydrophobic and polar residues were unfavorable, and charged residues tended to pair subject to charge complementarity, in agreement with previous reports. A bootstrap procedure, absent from previous studies, was used for error estimation. It showed that the statistical errors in the set of pairing preferences are generally small; the average standard error is approximately 0.2, i.e., about 8% of the average value of the pairwise index (2.9). However, for a few pairs (e.g., Ser-Ser and Glu-Asp) the standard error is larger in magnitude than the pairing index, which makes it impossible to tell whether contact formation is favorable or unfavorable. The results are interpreted in terms of physicochemical factors, and their implications for the energetics of complex formation and for protein docking are discussed.
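The bootstrap error estimate described in the abstract can be illustrated with a minimal sketch. The abstract does not define the pairing index, so the code below assumes a simple log-odds ratio of observed to expected contact frequency, and it assumes that whole interfaces are the resampling unit. The function names (`pairing_index`, `bootstrap_standard_errors`) and the toy contact lists are hypothetical and are not taken from the paper.

```python
import math
import random
import statistics
from collections import Counter


def pair_key(a, b):
    """Order-independent key, so (TRP, LEU) and (LEU, TRP) count as one pair."""
    return tuple(sorted((a, b)))


def pairing_index(interfaces):
    """Assumed index: log(observed pair frequency / expected pair frequency).

    `interfaces` is a list of interfaces; each interface is a list of
    (residue_a, residue_b) contacts. This definition is an illustration,
    not necessarily the index used in the paper.
    """
    pair_counts = Counter()
    residue_counts = Counter()
    for contacts in interfaces:
        for a, b in contacts:
            pair_counts[pair_key(a, b)] += 1
            residue_counts[a] += 1
            residue_counts[b] += 1
    total_pairs = sum(pair_counts.values())
    total_residues = sum(residue_counts.values())
    index = {}
    for (a, b), n in pair_counts.items():
        fa = residue_counts[a] / total_residues
        fb = residue_counts[b] / total_residues
        # Unordered pairs of distinct residues can arise in two orders.
        expected = fa * fb if a == b else 2 * fa * fb
        index[(a, b)] = math.log((n / total_pairs) / expected)
    return index


def bootstrap_standard_errors(interfaces, n_resamples=200, seed=0):
    """Resample interfaces with replacement and return the standard
    deviation of each pair's index across the resamples."""
    rng = random.Random(seed)
    samples = {}
    for _ in range(n_resamples):
        resample = rng.choices(interfaces, k=len(interfaces))
        # A pair absent from a resample contributes no value for that draw.
        for pair, value in pairing_index(resample).items():
            samples.setdefault(pair, []).append(value)
    return {p: statistics.stdev(v) for p, v in samples.items() if len(v) > 1}


# Toy data (hypothetical): four interfaces with a handful of contacts each.
toy = [
    [("TRP", "LEU"), ("TRP", "LEU"), ("GLY", "ALA")],
    [("TRP", "LEU"), ("LYS", "GLU")],
    [("LYS", "GLU"), ("GLY", "ALA"), ("TRP", "LEU")],
    [("SER", "SER"), ("LYS", "GLU")],
]
index = pairing_index(toy)
errors = bootstrap_standard_errors(toy)
for pair in sorted(index):
    # A standard error larger than |index| means the sign is not resolved.
    resolved = abs(index[pair]) > errors.get(pair, float("inf"))
    print(pair, round(index[pair], 2), round(errors.get(pair, float("nan")), 2), resolved)
```

Comparing each standard error with the magnitude of its index mirrors the abstract's point about pairs such as Ser-Ser and Glu-Asp: when the error exceeds the index, the sign of the preference cannot be determined from the data.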