Tuncbag Nurcan, Gursoy Attila, Keskin Ozlem
Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Rumelifeneri Yolu, Sariyer Istanbul, Turkey.
Bioinformatics. 2009 Jun 15;25(12):1513-20. doi: 10.1093/bioinformatics/btp240. Epub 2009 Apr 8.
Hot spots are residues that make up only a small fraction of an interface yet account for the majority of the binding energy. These residues are critical to understanding the principles of protein interactions. Experimental studies such as alanine scanning mutagenesis require significant effort; therefore, there is a need for computational methods to predict hot spots in protein interfaces.
We present a new, intuitive, and efficient method for determining computational hot spots based on conservation (C), solvent accessibility [accessible surface area (ASA)], and statistical pairwise residue potentials (PP) of the interface residues. Combinations of these features are examined comprehensively to study their effect on hot spot detection. The predicted hot spots match the experimental hot spots with an accuracy of 70% and a precision of 64% in the Alanine Scanning Energetics Database (ASEdb), and with an accuracy of 70% and a precision of 73% in the Binding Interface Database (BID). Several machine learning methods are also applied to predict hot spots. The performance of our empirical approach exceeds that of the learning-based methods and of other existing hot spot prediction methods. Residue occlusion from solvent in the complexes and pairwise potentials are found to be the main discriminative features in hot spot prediction.
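To make the reported metrics concrete, the sketch below shows how an empirical threshold rule over solvent occlusion and pairwise potential could be scored against experimental labels. It is only an illustration under stated assumptions: the abstract does not give the decision rule or its cutoffs, so the `Residue` fields, the `predict_hot_spot` function, the default thresholds, and the toy residues are all hypothetical and are not the authors' implementation or data.

```python
from dataclasses import dataclass


@dataclass
class Residue:
    name: str                # residue label (made up for this example)
    rel_asa_complex: float   # relative ASA in the complex, in percent
    pair_potential: float    # summed pairwise contact potential (arbitrary units)
    is_hot_spot: bool        # experimental label, e.g. from alanine scanning


def predict_hot_spot(res, max_rel_asa=20.0, min_potential=18.0):
    """Hypothetical rule: buried in the complex AND strongly interacting.

    The default cutoffs are placeholders, not values taken from the abstract.
    """
    return res.rel_asa_complex <= max_rel_asa and res.pair_potential >= min_potential


def accuracy_precision(residues, **thresholds):
    """Compare predictions with experimental labels; return (accuracy, precision)."""
    tp = fp = tn = fn = 0
    for res in residues:
        predicted = predict_hot_spot(res, **thresholds)
        if predicted and res.is_hot_spot:
            tp += 1
        elif predicted and not res.is_hot_spot:
            fp += 1
        elif not predicted and res.is_hot_spot:
            fn += 1
        else:
            tn += 1
    accuracy = (tp + tn) / len(residues)
    # Precision is undefined when nothing is predicted positive; report 0.0 then.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, precision


# Toy interface residues, invented purely to exercise the functions.
toy_interface = [
    Residue("W38", 5.0, 25.0, True),    # buried, strong contacts, hot spot
    Residue("K15", 12.0, 20.0, False),  # predicted hot spot, but not one
    Residue("S42", 60.0, 8.0, False),   # exposed, weak contacts
    Residue("Y29", 30.0, 22.0, True),   # hot spot missed by the ASA cutoff
    Residue("D35", 10.0, 19.0, True),   # buried, strong contacts, hot spot
]

acc, prec = accuracy_precision(toy_interface)
print(f"accuracy={acc:.2f} precision={prec:.2f}")
```

Accuracy counts every correct call, while precision counts only how many predicted hot spots are real. That is why the two figures can differ, as they do in the ASEdb and BID results above.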
Our empirical method is a simple approach to hot spot prediction that nonetheless offers high accuracy and computational efficiency. We believe that this method provides insights for researchers working on the characterization of protein binding sites and the design of specific therapeutic agents for protein interactions.
The lists of training and test sets are available as Supplementary Data at http://prism.ccbb.ku.edu.tr/hotpoint/supplement.doc.
Supplementary data are available at Bioinformatics online.