Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA.
Proteins. 2011 Aug;79(8):2467-74. doi: 10.1002/prot.23070. Epub 2011 Jun 1.
Proteins often undergo conformational changes when binding to each other. A major fraction of backbone conformational changes involves motion on the protein surface, particularly in loops. Accounting for the motion of protein surface loops is a challenge for protein-protein docking algorithms. A first step in addressing this challenge is to distinguish protein surface loops that are likely to undergo backbone conformational changes upon protein-protein binding (mobile loops) from those that are not (stationary loops). In this study, we developed a machine learning strategy based on support vector machines (SVMs). Our SVM uses three features of loop residues in the unbound protein structures (Ramachandran angles, crystallographic B-factors, and relative accessible surface area) to distinguish mobile loops from stationary ones. In cross-validation, this method yields an average prediction accuracy of 75.3%, compared with 50% for random prediction, and an average area under the receiver operating characteristic (ROC) curve of 0.79. Testing the method on an independent dataset, we obtained a prediction accuracy of 70.5%. Finally, we applied the method to 11 complexes that involve members of the Ras superfamily and achieved a prediction accuracy of 92.8% for the Ras superfamily proteins and 74.4% for their binding partners.
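The two evaluation measures quoted above, prediction accuracy and area under the ROC curve, can be illustrated with a minimal Python sketch. It is not the authors' code. The label encoding (1 for a mobile loop, 0 for a stationary loop), the function names, the 0.5 decision threshold, and the toy scores are all assumptions made for illustration; none of the numbers come from the study.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Fraction of loops whose predicted class matches the true class."""
    if not labels or len(labels) != len(predictions):
        raise ValueError("labels and predictions must be non-empty and equal in length")
    correct = sum(1 for l, p in zip(labels, predictions) if l == p)
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via its rank interpretation: the probability
    that a randomly chosen mobile loop (label 1) receives a higher score than
    a randomly chosen stationary loop (label 0), counting ties as one half."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must be equal in length")
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = mobile loop, 0 = stationary loop.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.1]  # made-up classifier scores
toy_predictions = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed threshold

print(f"accuracy = {accuracy(toy_labels, toy_predictions):.3f}")
print(f"ROC AUC  = {roc_auc(toy_labels, toy_scores):.3f}")
```

A classifier whose scores carry no information (for example, the same score for every loop) gives an AUC of 0.5 under this definition. On a dataset with equal numbers of mobile and stationary loops, random guessing likewise gives about 50% accuracy, which is the baseline the abstract compares against.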