CSAR 2014: A Benchmark Exercise Using Unpublished Data from Pharma.

Author information

Carlson Heather A, Smith Richard D, Damm-Ganamet Kelly L, Stuckey Jeanne A, Ahmed Aqeel, Convery Maire A, Somers Donald O, Kranz Michael, Elkins Patricia A, Cui Guanglei, Peishoff Catherine E, Lambert Millard H, Dunbar James B

Affiliations

Department of Medicinal Chemistry, College of Pharmacy, University of Michigan, 428 Church St., Ann Arbor, Michigan 48109-1065, United States.

Center for Structural Biology, University of Michigan, 3358E Life Sciences Institute, 210 Washtenaw Ave., Ann Arbor, Michigan 48109-2216, United States.

Publication information

J Chem Inf Model. 2016 Jun 27;56(6):1063-77. doi: 10.1021/acs.jcim.5b00523. Epub 2016 May 17.

DOI:10.1021/acs.jcim.5b00523

PMID:27149958

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5228621/

Abstract

The 2014 CSAR Benchmark Exercise was the last community-wide exercise that was conducted by the group at the University of Michigan, Ann Arbor. For this event, GlaxoSmithKline (GSK) donated unpublished crystal structures and affinity data from in-house projects. Three targets were used: tRNA (m1G37) methyltransferase (TrmD), Spleen Tyrosine Kinase (SYK), and Factor Xa (FXa). A particularly strong feature of the GSK data is its large size, which lends greater statistical significance to comparisons between different methods. In Phase 1 of the CSAR 2014 Exercise, participants were given several protein-ligand complexes and asked to identify the one near-native pose from among 200 decoys provided by CSAR. Though decoys were requested by the community, we found that they complicated our analysis. We could not discern whether poor predictions were failures of the chosen method or an incompatibility between the participant's method and the setup protocol we used. This problem is inherent to decoys, and we strongly advise against their use. In Phase 2, participants had to dock and rank/score a set of small molecules given only the SMILES strings of the ligands and a protein structure with a different ligand bound. Overall, docking was a success for most participants, much better in Phase 2 than in Phase 1. However, scoring was a greater challenge. No particular approach to docking and scoring had an edge, and successful methods included empirical, knowledge-based, machine-learning, shape-fitting, and even those with solvation and entropy terms. Several groups were successful in ranking TrmD and/or SYK, but ranking FXa ligands was intractable for all participants. 
Methods that were able to dock well across all submitted systems include MDock [1], Glide-XP [2], PLANTS [3], Wilma [4], Gold [5], SMINA [6], Glide-XP [2]/PELE [7], FlexX [8], and MedusaDock [9]. In fact, the submission based on Glide-XP [2]/PELE [7] cross-docked all ligands to many crystal structures, and it was particularly impressive to see success across an ensemble of protein structures for multiple targets. For scoring/ranking, submissions that showed statistically significant achievement include MDock [1] using ITScore [1,10] with a flexible-ligand term [11], SMINA [6] using Autodock-Vina [12,13], FlexX [8] using HYDE [14], and Glide-XP [2] using XP DockScore [2] with and without ROCS [15] shape similarity [16]. Of course, these results are for only three protein targets, and many more systems need to be investigated to truly identify which approaches are more successful than others. Furthermore, our exercise is not a competition.
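
To make the two tasks in the abstract concrete, the following is a minimal Python sketch of how a pose-selection result and a ranking result might be checked. It is not the evaluation protocol used by the CSAR organizers: the choice of Spearman rank correlation, the 2.0 Å RMSD cutoff, the function names, and all numbers are assumptions introduced here for illustration only.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank correlation, computed as the Pearson correlation of ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def top_pose_is_near_native(scores, rmsds, cutoff=2.0):
    """True if the best-scored pose (lowest score) is within the RMSD cutoff.

    The 2.0 Angstrom default is an assumed, commonly used threshold.
    """
    best = min(range(len(scores)), key=lambda i: scores[i])
    return rmsds[best] <= cutoff


# Hypothetical data, not taken from the CSAR 2014 set.
docking_scores = [-9.1, -7.4, -8.2, -6.0, -7.9]   # lower = predicted tighter
experimental_pic50 = [8.3, 6.1, 7.0, 5.2, 7.5]    # higher = tighter binding

# Negate the scores so that "higher is better" holds on both sides.
rho = spearman_rho([-s for s in docking_scores], experimental_pic50)

# Phase 1 style check: one near-native pose hidden among decoys.
pose_scores = [-5.0, -8.5, -7.0]
pose_rmsds = [6.3, 1.2, 3.8]   # RMSD to the crystal pose, in Angstrom
found = top_pose_is_near_native(pose_scores, pose_rmsds)

print(round(rho, 2), found)
```

A rank-based measure is used in this sketch because the abstract frames the task as ranking ligands rather than predicting absolute affinities; other measures could equally be substituted.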

Similar articles

Docking and Scoring with Target-Specific Pose Classifier Succeeds in Native-Like Pose Identification But Not Binding Affinity Prediction in the CSAR 2014 Benchmark Exercise.

J Chem Inf Model. 2016 Jun 27;56(6):1032-41. doi: 10.1021/acs.jcim.5b00751. Epub 2016 Apr 20.

Blind Pose Prediction, Scoring, and Affinity Ranking of the CSAR 2014 Dataset.

J Chem Inf Model. 2016 Jun 27;56(6):996-1003. doi: 10.1021/acs.jcim.5b00337. Epub 2015 Oct 1.

Choosing the Optimal Rigid Receptor for Docking and Scoring in the CSAR 2013/2014 Experiment.

J Chem Inf Model. 2016 Jun 27;56(6):1004-12. doi: 10.1021/acs.jcim.5b00338. Epub 2015 Aug 7.

Boosted neural networks scoring functions for accurate ligand docking and ranking.

J Bioinform Comput Biol. 2018 Apr;16(2):1850004. doi: 10.1142/S021972001850004X. Epub 2018 Feb 4.

Integration of Ligand and Structure Based Approaches for CSAR-2014.

J Chem Inf Model. 2016 Jun 27;56(6):974-87. doi: 10.1021/acs.jcim.5b00477. Epub 2015 Nov 5.

Combined Approach of Patch-Surfer and PL-PatchSurfer for Protein-Ligand Binding Prediction in CSAR 2013 and 2014.

J Chem Inf Model. 2016 Jun 27;56(6):1088-99. doi: 10.1021/acs.jcim.5b00625. Epub 2015 Dec 30.

Evaluation of the Wilma-SIE Virtual Screening Method in Community Structure-Activity Resource 2013 and 2014 Blind Challenges.

J Chem Inf Model. 2016 Jun 27;56(6):955-64. doi: 10.1021/acs.jcim.5b00278. Epub 2015 Aug 24.

CSAR benchmark exercise 2011-2012: evaluation of results from docking and relative ranking of blinded congeneric series.

J Chem Inf Model. 2013 Aug 26;53(8):1853-70. doi: 10.1021/ci400025f. Epub 2013 May 10.

Target-specific native/decoy pose classifier improves the accuracy of ligand ranking in the CSAR 2013 benchmark.

J Chem Inf Model. 2015 Jan 26;55(1):63-71. doi: 10.1021/ci500519w. Epub 2014 Dec 18.

Cited by

Benchmarking 3D Structure-Based Molecule Generators.

J Chem Inf Model. 2025 Aug 11;65(15):8006-8021. doi: 10.1021/acs.jcim.5c01020. Epub 2025 Jul 25.

New Insights into the Anticancer Effects and Toxicogenomic Safety of Two β-Lapachone Derivatives.

Pharmaceuticals (Basel). 2025 Jun 3;18(6):837. doi: 10.3390/ph18060837.

RosettaAMRLD: A Reaction-Driven Approach for Structure-Based Drug Design from Combinatorial Libraries with Monte Carlo Metropolis Algorithms.

J Chem Inf Model. 2025 Jun 23;65(12):5945-5959. doi: 10.1021/acs.jcim.5c00497. Epub 2025 Jun 11.

Robustly interrogating machine learning-based scoring functions: what are they learning?

Bioinformatics. 2025 Feb 4;41(2). doi: 10.1093/bioinformatics/btaf040.

BindingDB in 2024: a FAIR knowledgebase of protein-small molecule binding data.

Nucleic Acids Res. 2025 Jan 6;53(D1):D1633-D1644. doi: 10.1093/nar/gkae1075.

Template-guided method for protein-ligand complex structure prediction: Application to CASP15 protein-ligand studies.

Proteins. 2023 Dec;91(12):1829-1836. doi: 10.1002/prot.26535. Epub 2023 Jun 7.

Recent PELE Developments and Applications in Drug Discovery Campaigns.

Int J Mol Sci. 2022 Dec 17;23(24):16090. doi: 10.3390/ijms232416090.

Scoring Functions for Protein-Ligand Binding Affinity Prediction using Structure-Based Deep Learning: A Review.

Front Bioinform. 2022 Jun 17;2. doi: 10.3389/fbinf.2022.885983.

SCORCH: Improving structure-based virtual screening with machine learning classifiers, data augmentation, and uncertainty estimation.

J Adv Res. 2023 Apr;46:135-147. doi: 10.1016/j.jare.2022.07.001. Epub 2022 Jul 25.

Protein-Ligand Docking in the Machine-Learning Era.

Molecules. 2022 Jul 18;27(14):4568. doi: 10.3390/molecules27144568.

References

Combined Approach of Patch-Surfer and PL-PatchSurfer for Protein-Ligand Binding Prediction in CSAR 2013 and 2014.

J Chem Inf Model. 2016 Jun 27;56(6):1088-99. doi: 10.1021/acs.jcim.5b00625. Epub 2015 Dec 30.

PELE: Protein Energy Landscape Exploration. A Novel Monte Carlo Based Technique.

J Chem Theory Comput. 2005 Nov;1(6):1304-11. doi: 10.1021/ct0501811.

Rapid Prediction of Solvation Free Energy. 2. The First-Shell Hydration (FiSH) Continuum Model.

J Chem Theory Comput. 2010 May 11;6(5):1622-37. doi: 10.1021/ct9006037. Epub 2010 Apr 2.

Evaluation of GalaxyDock Based on the Community Structure-Activity Resource 2013 and 2014 Benchmark Studies.

J Chem Inf Model. 2016 Jun 27;56(6):988-95. doi: 10.1021/acs.jcim.5b00309. Epub 2015 Nov 30.

Predicting Binding Poses and Affinities in the CSAR 2013-2014 Docking Exercises Using the Knowledge-Based Convex-PL Potential.

J Chem Inf Model. 2016 Jun 27;56(6):1053-62. doi: 10.1021/acs.jcim.5b00339. Epub 2015 Nov 25.

Integration of Ligand and Structure Based Approaches for CSAR-2014.

J Chem Inf Model. 2016 Jun 27;56(6):974-87. doi: 10.1021/acs.jcim.5b00477. Epub 2015 Nov 5.

CSAR Benchmark Exercise 2013: Evaluation of Results from a Combined Computational Protein Design, Docking, and Scoring/Ranking Challenge.

J Chem Inf Model. 2016 Jun 27;56(6):1022-31. doi: 10.1021/acs.jcim.5b00387. Epub 2015 Oct 9.

Blind Pose Prediction, Scoring, and Affinity Ranking of the CSAR 2014 Dataset.

J Chem Inf Model. 2016 Jun 27;56(6):996-1003. doi: 10.1021/acs.jcim.5b00337. Epub 2015 Oct 1.

Iterative Knowledge-Based Scoring Functions Derived from Rigid and Flexible Decoy Structures: Evaluation with the 2013 and 2014 CSAR Benchmarks.

J Chem Inf Model. 2016 Jun 27;56(6):1013-21. doi: 10.1021/acs.jcim.5b00504. Epub 2015 Oct 1.

HybridDock: A Hybrid Protein-Ligand Docking Protocol Integrating Protein- and Ligand-Based Approaches.

J Chem Inf Model. 2016 Jun 27;56(6):1078-87. doi: 10.1021/acs.jcim.5b00275. Epub 2015 Sep 8.
