
A D3R prospective evaluation of machine learning for protein-ligand scoring.

Authors

Sunseri Jocelyn, Ragoza Matthew, Collins Jasmine, Koes David Ryan

Affiliation

Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Suite 3064, Biomedical Science Tower 3 (BST3), 3501 Fifth Avenue, Pittsburgh, PA, 15260, USA.

Publication information

J Comput Aided Mol Des. 2016 Sep;30(9):761-771. doi: 10.1007/s10822-016-9960-x. Epub 2016 Sep 3.

DOI:10.1007/s10822-016-9960-x

PMID:27592011

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5079830/

Abstract

We assess the performance of several machine learning-based scoring methods at protein-ligand pose prediction, virtual screening, and binding affinity prediction. The methods and the manner in which they were trained make them sufficiently diverse to evaluate the utility of various strategies for training set curation and binding pose generation, but they share a novel approach to classification in the context of protein-ligand scoring. Rather than explicitly using structural data such as affinity values or information extracted from crystal binding poses for training, we instead exploit the abundance of data available from high-throughput screening to approach the problem as one of discriminating binders from non-binders. We evaluate the performance of our various scoring methods in the 2015 D3R Grand Challenge and find that although the merits of some features of our approach remain inconclusive, our scoring methods performed comparably to a state-of-the-art scoring function that was fit to binding affinity data.
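As a rough illustration of what "discriminating binders from non-binders" means for a scoring method, the sketch below measures how often a binder is scored above a non-binder (the area under the ROC curve). The abstract does not state which metric the authors used, so the choice of AUC, the function name `roc_auc`, and the scores and labels are all assumptions made for illustration only.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of (binder, non-binder) pairs in which the binder scores higher.

    Ties count as half a win. Labels: 1 = binder, 0 = non-binder.
    """
    binders = [s for s, y in zip(scores, labels) if y == 1]
    non_binders = [s for s, y in zip(scores, labels) if y == 0]
    if not binders or not non_binders:
        raise ValueError("need at least one binder and one non-binder")

    wins = 0.0
    for b in binders:
        for n in non_binders:
            if b > n:
                wins += 1.0
            elif b == n:
                wins += 0.5
    return wins / (len(binders) * len(non_binders))


# Hypothetical classifier outputs for six compounds (not data from the paper).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```

An AUC of 1.0 would mean every binder outranks every non-binder, while 0.5 corresponds to random ordering. The pairwise loop is quadratic in the number of compounds, which is fine for a sketch but would be replaced by a rank-based computation for screening-scale data.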




