Institute of Molecular and Cellular Biosciences, The University of Tokyo, Tokyo, Japan.
Proteins. 2013 Jun;81(6):1005-16. doi: 10.1002/prot.24252. Epub 2013 Feb 27.
We propose a fast clustering and reranking method, CyClus, for protein-protein docking decoys. The method enables comprehensive clustering of all decoys generated by rigid-body docking, using a cylindrical approximation of the protein-protein interface and hierarchical clustering procedures. We demonstrate the clustering and reranking of 54,000 decoy structures generated by ZDOCK for each complex within a few minutes. After parameter tuning on the test set in ZDOCK benchmark 2.0 with the ZDOCK and ZRANK scoring functions, blind tests were conducted on the incremental data in ZDOCK benchmarks 3.0 and 4.0. CyClus successfully generated smaller subsets of decoys containing near-native decoys. For example, the number of decoys required to create subsets containing near-native decoys with 80% probability was reduced to between 22% and 50% of the number required by the original ZDOCK. Although only ZDOCK and ZRANK results are demonstrated, the CyClus algorithm was designed to be general and can be applied to a wide range of decoys and scoring functions by adjusting just two parameters, p and T. CyClus results were also compared with those from ClusPro.
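The cluster-then-rerank idea in the abstract can be illustrated with a toy sketch. This is not the CyClus algorithm: the abstract does not define the cylindrical descriptor, the linkage criterion, or the parameters p and T, so everything below is an assumption made for illustration. Each decoy's interface is summarised by a hypothetical cylinder (a centre point and a unit axis), the `dissimilarity` function and its `angle_weight` are invented, single-linkage clustering cut at a threshold is computed as connected components with a union-find, and scores are assumed to be "higher is better".

```python
import math
from dataclasses import dataclass


@dataclass
class Decoy:
    name: str
    score: float   # assumption: higher score is better
    center: tuple  # hypothetical cylinder centre (x, y, z)
    axis: tuple    # hypothetical unit vector along the cylinder axis


def dissimilarity(a, b, angle_weight=10.0):
    """Illustrative measure: centre distance plus an axis-misalignment penalty."""
    d = math.dist(a.center, b.center)
    cos = abs(sum(x * y for x, y in zip(a.axis, b.axis)))
    return d + angle_weight * (1.0 - min(cos, 1.0))


def cluster_and_rerank(decoys, threshold):
    """Single-linkage clusters cut at `threshold`, one representative each.

    Cutting a single-linkage hierarchy at a given height yields the connected
    components of the graph that links pairs closer than that height, so a
    union-find over all pairs is enough for this toy version.
    """
    parent = list(range(len(decoys)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(decoys)):
        for j in range(i + 1, len(decoys)):
            if dissimilarity(decoys[i], decoys[j]) <= threshold:
                parent[find(i)] = find(j)

    # Keep the best-scoring decoy of each cluster as its representative.
    best = {}
    for i, decoy in enumerate(decoys):
        root = find(i)
        if root not in best or decoy.score > best[root].score:
            best[root] = decoy

    # Rerank: representatives ordered by score, best first.
    return sorted(best.values(), key=lambda d: d.score, reverse=True)


decoys = [
    Decoy("d1", 12.0, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
    Decoy("d2", 10.0, (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
    Decoy("d3", 11.0, (20.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    Decoy("d4", 9.0, (21.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
]
ranked = cluster_and_rerank(decoys, threshold=3.0)
print([d.name for d in ranked])  # ['d1', 'd3']
```

The all-pairs loop is quadratic in the number of decoys, so this sketch only suits small sets. It says nothing about how CyClus itself handles 54,000 decoys within a few minutes.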