Li Yan, Han Li, Liu Zhihai, Wang Renxiao
State Key Laboratory of Bioorganic and Natural Products Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, 345 Lingling Road, Shanghai 200032, People's Republic of China.
J Chem Inf Model. 2014 Jun 23;54(6):1717-36. doi: 10.1021/ci500081m. Epub 2014 Jun 2.
Our comparative assessment of scoring functions (CASF) benchmark was created to provide an objective evaluation of current scoring functions. The key idea of CASF is to compare the general performance of scoring functions on a diverse set of protein-ligand complexes. To avoid testing scoring functions in the context of molecular docking, the scoring process is separated from the docking (or sampling) process by using ensembles of ligand binding poses that are generated in advance. Here, we describe the technical methods and evaluation results of the latest CASF-2013 study. The PDBbind core set (version 2013), which consists of 195 protein-ligand complexes with high-quality three-dimensional structures and reliable binding constants, was employed as the primary test set in this study. A panel of 20 scoring functions, most of which are implemented in mainstream commercial software, was evaluated in terms of "scoring power" (binding affinity prediction), "ranking power" (relative ranking prediction), "docking power" (binding pose prediction), and "screening power" (discrimination of true binders from random molecules). Our results reveal that the performance of these scoring functions is generally more promising in the docking/screening power tests than in the scoring/ranking power tests. Top-ranked scoring functions in the scoring power test, such as X-Score(HM), ChemScore@SYBYL, ChemPLP@GOLD, and PLP@DS, are also top-ranked in the ranking power test. Top-ranked scoring functions in the docking power test, such as ChemPLP@GOLD, ChemScore@GOLD, GlideScore-SP, LigScore@DS, and PLP@DS, are also top-ranked in the screening power test. Our results obtained on the entire test set and its subsets suggest that the real challenge in protein-ligand binding affinity prediction lies in polar interactions and the associated desolvation effects. Nonadditive features observed among high-affinity protein-ligand complexes also need attention.
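To make three of the four tests concrete, the following is a minimal Python sketch of how such metrics are commonly computed. The abstract does not state the exact formulas, so everything here is an assumption for illustration: Pearson correlation for scoring power, a top-pose success rate with a 2.0 Å RMSD cutoff for docking power, an enrichment factor for screening power, and the convention that a higher score means a better prediction. All function names and the toy numbers are hypothetical and are not taken from the study.

```python
import math


def pearson_r(predicted, experimental):
    """Scoring power (assumed metric): Pearson correlation between
    predicted scores and experimental binding affinities."""
    n = len(predicted)
    if n != len(experimental) or n < 2:
        raise ValueError("need two equal-length sequences with >= 2 items")
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    if var_p == 0 or var_e == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_e)


def docking_success_rate(poses_by_complex, rmsd_cutoff=2.0):
    """Docking power (assumed metric): fraction of complexes whose
    best-scored pose lies within rmsd_cutoff of the native pose.

    poses_by_complex maps a complex id to a list of (score, rmsd) pairs.
    """
    hits = 0
    for poses in poses_by_complex.values():
        _, top_rmsd = max(poses, key=lambda pose: pose[0])  # higher score = better
        if top_rmsd <= rmsd_cutoff:
            hits += 1
    return hits / len(poses_by_complex)


def enrichment_factor(scored_molecules, top_fraction=0.01):
    """Screening power (assumed metric): hit rate among the top-ranked
    fraction divided by the hit rate in the whole library.

    scored_molecules is a list of (score, is_true_binder) pairs.
    """
    ranked = sorted(scored_molecules, key=lambda m: m[0], reverse=True)
    n_top = max(1, round(len(ranked) * top_fraction))
    total_binders = sum(1 for _, is_binder in ranked if is_binder)
    if total_binders == 0:
        raise ValueError("library contains no true binders")
    found = sum(1 for _, is_binder in ranked[:n_top] if is_binder)
    return (found / n_top) / (total_binders / len(ranked))


# Toy data, invented purely for illustration.
toy_poses = {
    "complex_a": [(5.0, 1.2), (3.0, 4.0)],  # top-scored pose is near-native
    "complex_b": [(2.0, 0.5), (6.0, 3.5)],  # top-scored pose is a decoy
}
toy_library = [(10 - i, i in (1, 7)) for i in range(10)]  # binders at ranks 2 and 8

print(pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(docking_success_rate(toy_poses))
print(enrichment_factor(toy_library, top_fraction=0.2))
```

Ranking power is omitted from the sketch because it depends on how complexes are grouped by target protein, which the abstract does not describe.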