Department of Biomedical Engineering, Boston University, Boston, Massachusetts, 02215.
Proteins. 2013 Nov;81(11):1874-84. doi: 10.1002/prot.24343. Epub 2013 Aug 19.
Most structure prediction algorithms consist of initial sampling of the conformational space, followed by rescoring and possibly refinement of a number of selected structures. Here we focus on protein docking and show that, while decoupling sampling and scoring facilitates method development, integrating the two steps can lead to substantial improvements in docking results. Since decoupling is usually achieved by generating a decoy set containing both non-native and near-native docked structures, which can then be used for scoring function construction, we first review the roles and potential pitfalls of decoys in protein-protein docking, and show that some types of decoys are better than others for method development. We then describe three case studies showing that complete decoupling of scoring from sampling is not the best choice for solving realistic docking problems. Although some of the examples are based on our own experience, the results of the CAPRI docking and scoring experiments also show that performing both sampling and scoring generally yields better results than scoring the structures generated by all predictors. Next we investigate how the selection of training and decoy sets affects the performance of the resulting scoring functions. Finally, we discuss pathways to better alignment of the two steps, and show some algorithms that achieve a certain level of integration. Although we focus on protein-protein docking, our observations most likely also apply to other conformational search problems, including protein structure prediction and the docking of small molecules to proteins.