Institute of Functional Genomics, University of Regensburg, Regensburg, Germany.
J Comput Chem. 2011 Sep;32(12):2575-86. doi: 10.1002/jcc.21837. Epub 2011 May 31.
One of the main challenges in protein-protein docking is the meaningful evaluation of the many putative solutions. Here we present a program (PROCOS) that calculates, for a given complex, a probability-like measure of being native. In contrast to the scores often used for analyzing complex structures, the calculated probabilities have the advantage of a fixed range of expected values. In principle, this allows the comparison of models that correspond to different targets and were solved with the same algorithm. Judgments are based on distributions of properties derived from a large database of native and false complexes; to analyze a complex, PROCOS uses these property distributions together with a support vector machine (SVM). PROCOS was compared to the established scoring schemes ZRANK and DFIRE. For a set of experimentally solved native complexes, probability values above 50% were obtained for 90% of the structures. Next, the performance of PROCOS was tested on the 40 binary targets of the Dockground decoy set, on 14 targets of the RosettaDock decoy set, and on 9 targets that participated in the CAPRI scoring evaluation. Again, the advantage of a probability-based scoring system became apparent, and a reasonable number of near-native complexes was found among the top-ranked complexes. In conclusion, we present a novel, fully automated method that allows the reliable evaluation of protein-protein complexes.
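The idea of turning property distributions of native and false complexes into a bounded, probability-like value can be illustrated with a minimal sketch. This is not the PROCOS implementation: the abstract states that PROCOS uses an SVM, whereas the sketch below substitutes a simple naive-Bayes log-odds combination of histograms, purely to show why such a value always falls between 0 and 1 and can therefore be compared across targets. The property names (`prop_a`, `prop_b`), the bin edges, the prior, and all numbers are made-up assumptions.

```python
import math
from bisect import bisect_right


def bin_index(value, edges):
    """Map a value to a histogram bin, clamping out-of-range values to the end bins."""
    return min(max(bisect_right(edges, value) - 1, 0), len(edges) - 2)


def smoothed_histogram(values, edges):
    """Per-bin probabilities with add-one smoothing so that no bin is empty."""
    counts = [1.0] * (len(edges) - 1)
    for v in values:
        counts[bin_index(v, edges)] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def native_probability(features, native_hists, false_hists, edges, prior_native=0.5):
    """Combine per-property likelihood ratios into a value in (0, 1).

    Assumption: properties are treated as independent (naive Bayes).
    This stands in for the SVM named in the abstract; it is not that method.
    """
    log_odds = math.log(prior_native / (1.0 - prior_native))
    for name, value in features.items():
        i = bin_index(value, edges[name])
        log_odds += math.log(native_hists[name][i] / false_hists[name][i])
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical toy "database": property values observed for native and false complexes.
EDGES = {"prop_a": [0.0, 1.0, 2.0, 3.0], "prop_b": [0.0, 1.0, 2.0, 3.0]}
NATIVE = {"prop_a": [2.5, 2.2, 2.8, 1.5], "prop_b": [2.1, 2.9, 2.4, 1.1]}
FALSE = {"prop_a": [0.2, 0.5, 1.2, 0.8], "prop_b": [0.4, 0.9, 1.6, 0.1]}

native_hists = {k: smoothed_histogram(v, EDGES[k]) for k, v in NATIVE.items()}
false_hists = {k: smoothed_histogram(v, EDGES[k]) for k, v in FALSE.items()}

# Hypothetical docking models to be ranked by their probability-like value.
candidates = {
    "model_1": {"prop_a": 2.6, "prop_b": 2.3},
    "model_2": {"prop_a": 0.3, "prop_b": 0.7},
    "model_3": {"prop_a": 1.4, "prop_b": 2.2},
}


def score(model_name):
    return native_probability(candidates[model_name], native_hists, false_hists, EDGES)


ranked = sorted(candidates, key=score, reverse=True)

for name in ranked:
    print(f"{name}: {score(name):.3f}")
```

Because the output is bounded, a threshold such as the 50% value mentioned in the abstract has the same meaning for every target, which is not the case for an unbounded energy-like score.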