Improving Structure-Based Virtual Screening with Ensemble Docking and Machine Learning.

Affiliations

Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California C.P. 22860, Mexico.

Centro de Nanociencias y Nanotecnología, Universidad Nacional Autónoma de México (UNAM), Ensenada, Baja California C.P. 22860, Mexico.

Publication information

J Chem Inf Model. 2021 Nov 22;61(11):5362-5376. doi: 10.1021/acs.jcim.1c00511. Epub 2021 Oct 15.

DOI:10.1021/acs.jcim.1c00511

PMID:34652141

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8865842/

Abstract

One of the main challenges of structure-based virtual screening (SBVS) is incorporating receptor flexibility, because representing it explicitly in every docking run is computationally expensive. A common alternative is ensemble docking, in which each ligand is docked against a set of receptor conformations. However, there is still no agreement on how to combine the ensemble docking results to obtain the final ligand ranking. A common choice is to use consensus strategies to aggregate the ensemble docking scores, but these strategies yield only slight improvement over the single-structure approach. Here, we hypothesize that applying machine learning (ML) methodologies to the ensemble docking results can improve the predictive power of SBVS. To test this hypothesis, four proteins were selected as study cases: CDK2, FXa, EGFR, and HSP90. Protein conformational ensembles were built from crystallographic structures, whereas the evaluated compound library comprised up to three benchmarking data sets (DUD, DEKOIS 2.0, and CSAR-2012) and cocrystallized molecules. Ensemble docking results were processed through 30 repetitions of 4-fold cross-validation to train and validate two ML classifiers: logistic regression and gradient boosting trees. Our results indicate that the ML classifiers significantly outperform traditional consensus strategies and even the best-performing single-structure docking case. We provide statistical evidence that supports the effectiveness of ML in improving ensemble docking performance.
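The consensus baselines and the evaluation protocol mentioned in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code: the function names, the toy docking scores, and the choice of ROC-AUC as the metric are assumptions made for illustration, and the abstract does not say whether the cross-validation folds were stratified (this sketch assumes they were). In the study, each training split would feed an ML classifier that takes one ligand's scores across all conformations as its feature vector.

```python
import random
from statistics import mean

def consensus_scores(score_matrix, how="min"):
    """Collapse a ligands x conformations score table to one score per ligand."""
    reducers = {"min": min, "mean": mean}  # lowest score or average score
    reduce_row = reducers[how]
    return [reduce_row(row) for row in score_matrix]

def roc_auc(labels, scores):
    """ROC-AUC where a LOWER score means 'more likely active' (docking convention)."""
    actives = [s for s, y in zip(scores, labels) if y == 1]
    decoys = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5  # ties count as half a correct ordering
    return wins / (len(actives) * len(decoys))

def repeated_stratified_kfold(labels, n_splits=4, n_repeats=30, seed=0):
    """Yield (train_idx, test_idx) pairs, preserving the class ratio in each fold."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for cls in (0, 1):
            idx = [i for i, y in enumerate(labels) if y == cls]
            rng.shuffle(idx)
            for pos, i in enumerate(idx):
                folds[pos % n_splits].append(i)  # deal indices round-robin
        for k in range(n_splits):
            test = sorted(folds[k])
            train = sorted(i for j, f in enumerate(folds) if j != k for i in f)
            yield train, test

# Hypothetical docking scores: rows are ligands, columns are receptor conformations.
scores = [
    [-9.1, -7.0, -8.2],  # active
    [-6.5, -8.8, -7.1],  # active
    [-7.2, -6.9, -7.0],  # decoy
    [-5.0, -9.0, -5.5],  # decoy
]
labels = [1, 1, 0, 0]

auc_min = roc_auc(labels, consensus_scores(scores, "min"))
auc_mean = roc_auc(labels, consensus_scores(scores, "mean"))
print(auc_min, auc_mean)  # 0.75 1.0

# 30 repetitions of 4-fold cross-validation give 120 train/test splits.
cv_labels = [1] * 4 + [0] * 12
splits = list(repeated_stratified_kfold(cv_labels))
print(len(splits))  # 120
```

On this toy table the two consensus rules rank the ligands differently, which is the disagreement the abstract refers to when it notes there is no settled way to combine ensemble docking results.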

