CSAR benchmark exercise of 2010: selection of the protein-ligand complexes.

Affiliation

Department of Medicinal Chemistry, University of Michigan, Ann Arbor, Michigan 48109-1065, United States.

Publication information

J Chem Inf Model. 2011 Sep 26;51(9):2036-46. doi: 10.1021/ci200082t. Epub 2011 Jul 22.

DOI:10.1021/ci200082t

PMID:21728306

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3180202/

Abstract

A major goal in drug design is the improvement of computational methods for docking and scoring. The Community Structure Activity Resource (CSAR) aims to collect available data from industry and academia that may be used for this purpose (www.csardock.org). CSAR is also charged with organizing community-wide exercises based on the collected data. The first of these exercises aimed to gauge the overall state of docking and scoring, using a large and diverse data set of protein-ligand complexes. Participants were asked to calculate the affinity of the complexes as provided and then to recalculate with changes that might improve their specific method. This first data set was selected from existing PDB entries that had binding data (Kd or Ki) in Binding MOAD, augmented with entries from PDBbind. The final data set contains 343 diverse protein-ligand complexes and spans 14 pKd units. Sixteen proteins have three or more complexes in the data set, from which a user could start an inspection of congeneric series. Inherent experimental error limits the possible correlation between scores and measured affinity: Pearson R is limited to ~0.91 (Pearson R² of 0.83) when fitting to the data set without overparameterizing, and to ~0.83 (Pearson R² ~0.70) when scoring the data set with a method trained on outside data [corrected]. The details of how the data set was initially selected, and the process by which it matured to better fit the needs of the community, are presented. Many groups generously participated in improving the data set, which underscores the value of a supportive, collaborative effort in moving our field forward.
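The idea that experimental error caps the achievable Pearson R can be illustrated with a small simulation. The sketch below is not the authors' procedure, which the abstract does not describe. It assumes normally distributed true affinities and independent, normally distributed measurement error, and the spread values (2.0 and 1.0 pKd units) are hypothetical placeholders chosen only for illustration. Under those assumptions, even a perfect predictor scored against noisy measurements cannot exceed R = sd_true / sqrt(sd_true² + sd_error²).

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def max_r_analytic(sd_true, sd_error):
    """Ceiling on R for a perfect predictor compared with noisy measurements.

    Assumes the error is independent of the true value (an assumption of
    this sketch, not a statement about the CSAR data).
    """
    return sd_true / math.sqrt(sd_true ** 2 + sd_error ** 2)


def max_r_simulated(sd_true, sd_error, n=20000, seed=0):
    """Estimate the same ceiling by simulating noisy measurements."""
    rng = random.Random(seed)
    # Hypothetical "true" pKd values centred on 6.0.
    true_pkd = [rng.gauss(6.0, sd_true) for _ in range(n)]
    # Each measurement is the true value plus independent experimental error.
    measured = [t + rng.gauss(0.0, sd_error) for t in true_pkd]
    # A perfect predictor returns the true value, so its R is corr(true, measured).
    return pearson_r(true_pkd, measured)


if __name__ == "__main__":
    SD_TRUE = 2.0   # hypothetical spread of true affinities (pKd units)
    SD_ERROR = 1.0  # hypothetical experimental error (pKd units)
    print("analytic ceiling :", round(max_r_analytic(SD_TRUE, SD_ERROR), 3))
    print("simulated ceiling:", round(max_r_simulated(SD_TRUE, SD_ERROR), 3))
```

Changing `SD_ERROR` relative to `SD_TRUE` shows how quickly the ceiling falls as measurement noise grows. The specific limits quoted in the abstract depend on the error model and data used in the paper, so this sketch should not be expected to reproduce them.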


