Correcting the impact of docking pose generation error on binding affinity prediction.

Authors

Li Hongjian, Leung Kwong-Sak, Wong Man-Hon, Ballester Pedro J

Affiliations

Department of Computer Science and Engineering, Chinese University of Hong Kong, Hong Kong, China.

Cancer Research Center of Marseille, INSERM U1068, Marseille, F-13009, France.

Publication information

BMC Bioinformatics. 2016 Sep 22;17(Suppl 11):308. doi: 10.1186/s12859-016-1169-4.

DOI:10.1186/s12859-016-1169-4

PMID:28185549

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5046193/

Abstract

BACKGROUND

Pose generation error is usually quantified as the difference between the geometry of the pose generated by the docking software and that of the same molecule co-crystallised with the considered protein. Surprisingly, the impact of this error on binding affinity prediction is yet to be systematically analysed across diverse protein-ligand complexes.
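The geometric difference described above is commonly reported as a root-mean-square deviation (RMSD) between matched atoms. The following is a minimal sketch of that calculation, not the authors' code. It assumes both poses list the same atoms in the same order and share the same coordinate frame, and it applies no symmetry correction; the function name `pose_rmsd` is hypothetical.

```python
import math

def pose_rmsd(docked, crystal):
    """RMSD between two poses given as lists of (x, y, z) atom coordinates.

    Assumption: atoms are in the same order in both lists and both poses
    are expressed in the same coordinate frame (no superposition is done).
    """
    if not docked or len(docked) != len(crystal):
        raise ValueError("poses must be non-empty and have the same atom count")
    # Sum of squared coordinate differences over all matched atoms.
    squared = sum(
        (a - b) ** 2
        for p, q in zip(docked, crystal)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(docked))

# Two-atom toy poses: only the second atom is displaced, by 2 along z.
docked_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
crystal_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(pose_rmsd(docked_pose, crystal_pose))
```

A pose identical to the crystal pose gives an RMSD of zero; larger values indicate a larger pose generation error.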

RESULTS

Contrary to commonly held views, we found that pose generation error generally has a small impact on the accuracy of binding affinity prediction. This holds even for large pose generation errors, and it is observed not only with machine-learning scoring functions but also with classical scoring functions such as AutoDock Vina. Furthermore, we propose a procedure that corrects a substantial part of this error: calibrating the scoring functions with re-docked, rather than co-crystallised, poses. In this way, the relationship between Vina-generated protein-ligand poses and their binding affinities is learned directly. As a result, test set performance after this error-correcting procedure is much closer to that of predicting binding affinity in the absence of pose generation error (i.e. on crystal structures). We evaluated several strategies and obtained better results with those using a single docked pose per ligand than with those using multiple docked poses per ligand.
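The calibration idea can be illustrated with a deliberately simplified sketch. It is not the published method: a single-feature least-squares fit stands in for the machine-learning scoring function, and the field names `crystal_score`, `docked_score` and `pkd`, as well as all numeric values, are hypothetical placeholders rather than data from the study. The only point it shows is that the model can be fitted on scores computed from docked poses instead of crystal poses.

```python
from statistics import mean

def fit_linear(xs, ys):
    """Ordinary least squares for y = a * x + b with one feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature values must not all be identical")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def calibrate(complexes, pose_key):
    """Fit affinity against the score computed on the chosen pose type."""
    xs = [c[pose_key] for c in complexes]
    ys = [c["pkd"] for c in complexes]
    return fit_linear(xs, ys)

def predict(model, score):
    """Apply a fitted (slope, intercept) model to one score."""
    a, b = model
    return a * score + b

# Placeholder values for illustration only; not data from the study.
train = [
    {"crystal_score": 1.2, "docked_score": 1.0, "pkd": 3.0},
    {"crystal_score": 2.1, "docked_score": 2.0, "pkd": 5.0},
    {"crystal_score": 2.9, "docked_score": 3.0, "pkd": 7.0},
]

# Calibrating on docked poses: the model sees the same kind of input
# at training time that it will see when scoring new docked ligands.
docked_model = calibrate(train, "docked_score")
crystal_model = calibrate(train, "crystal_score")
print(predict(docked_model, 4.0))
```

In practice, the docked-pose model would be applied to scores from docked poses of test ligands, so that training and prediction use poses with the same kind of error.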

CONCLUSIONS

Binding affinity prediction is often carried out on the docked pose of a known binder rather than its co-crystallised pose. Our results suggest that pose generation error is in general far less damaging for binding affinity prediction than is currently believed. Another contribution of our study is the proposal of a procedure that largely corrects for this error. The resulting machine-learning scoring function is freely available at http://istar.cse.cuhk.edu.hk/rf-score-4.tgz and http://ballester.marseille.inserm.fr/rf-score-4.tgz.
