Negi Surendra S, Braun Werner
Department of Biochemistry and Molecular Biology, Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, 301 University Blvd, Galveston, TX 77555-0857, USA.
J Mol Model. 2007 Nov;13(11):1157-67. doi: 10.1007/s00894-007-0237-0. Epub 2007 Sep 9.
We have developed a fully automated method, InterProSurf, to predict interacting amino acid residues on the protein surfaces of monomeric 3D structures. Potential interacting residues are predicted based on solvent-accessible surface areas, a new scale for interface propensities, and a clustering algorithm that locates surface-exposed areas with high interface propensities. Previous studies have shown the importance of hydrophobic residues and specific charge distributions as characteristics of interfaces. Here we show differences between interface and surface regions in all physicochemical properties of residues, as represented by five quantitative descriptors. In the current study, a set of 72 protein complexes with known 3D structures was analyzed to obtain interface propensities of residues and to find differences in the distribution of the five quantitative descriptors for amino acid residues. We also investigated spatial pair correlations of solvent-accessible residues in interface and surface areas, and compared log-odds ratios for interface and surface areas. A new scoring method to predict potential functional sites on the protein surface was developed and tested on a new dataset of 21 protein complexes that were not included in the original training dataset. Empirically, we found that the algorithm achieves a good balance between precision and sensitivity when the eight highest-scoring clusters are selected as interface regions. The performance of the method is illustrated for a dimeric ATPase of the hyperthermophile Methanococcus jannaschii and for the capsid protein of human hepatitis B virus. An automated version of the method can be accessed from our web server at http://curie.utmb.edu/prosurf.html.
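The quantities named in the abstract (interface propensities expressed as log-odds, selection of the top-scoring clusters, and precision and sensitivity) can be illustrated with a minimal Python sketch. It is not the InterProSurf implementation: the count-based propensity definition, the function names, and the input formats are all assumptions made for illustration, and the published scale may be defined differently (for example, weighted by solvent-accessible area).

```python
import math
from collections import Counter


def interface_propensity(interface_residues, surface_residues):
    """Assumed count-based log-odds: ln(frequency in interface / frequency on surface).

    Residue types absent from the surface set are skipped to avoid division by zero.
    """
    iface = Counter(interface_residues)
    surf = Counter(surface_residues)
    n_iface = sum(iface.values())
    n_surf = sum(surf.values())
    return {
        aa: math.log((iface[aa] / n_iface) / (surf[aa] / n_surf))
        for aa in iface
        if surf[aa] > 0
    }


def top_clusters(cluster_scores, k=8):
    """Return the ids of the k highest-scoring clusters (k=8 follows the abstract)."""
    return sorted(cluster_scores, key=cluster_scores.get, reverse=True)[:k]


def precision_sensitivity(predicted, actual):
    """Precision = TP / predicted; sensitivity = TP / actual interface residues."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    sensitivity = true_positives / len(actual) if actual else 0.0
    return precision, sensitivity
```

In this sketch, a positive propensity marks a residue type that is enriched in interfaces relative to the rest of the surface. Raising `k` would typically increase sensitivity at the cost of precision, which is the trade-off the abstract describes for the choice of eight clusters.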