Machine-Learning- and Knowledge-Based Scoring Functions Incorporating Ligand and Protein Fingerprints.

Author information

Fujimoto Kazuhiro J, Minami Shota, Yanai Takeshi

Affiliations

Institute of Transformative Bio-Molecules (WPI-ITbM), Nagoya University, Furocho, Chikusa, Nagoya 464-8601, Japan.

Department of Chemistry, Graduate School of Science, Nagoya University, Furocho, Chikusa, Nagoya 464-8601, Japan.

Publication information

ACS Omega. 2022 May 25;7(22):19030-19039. doi: 10.1021/acsomega.2c02822. eCollection 2022 Jun 7.

DOI:10.1021/acsomega.2c02822

PMID:35694525

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9178954/

Abstract

We propose a novel machine-learning-based scoring function for drug discovery that incorporates ligand and protein structural information into a knowledge-based PMF score. Molecular docking, a simulation method for structure-based drug design (SBDD), is expected to reduce the enormous costs associated with conventional experimental methods in terms of rational drug discovery. Molecular docking has two main purposes: to predict ligand-binding structures for target proteins and to predict protein-ligand binding affinity. Currently available programs of molecular docking offer an accurate prediction of ligand binding structures for many systems. However, the accurate prediction of binding affinity remains challenging. In this study, we developed a new scoring function that incorporates fingerprints representing ligand and protein structures as descriptors in the PMF score. Here, regression analysis of the scoring function was performed using the following machine learning techniques: least absolute shrinkage and selection operator (LASSO) and light gradient boosting machine (LightGBM). The results on a test data set showed that the binding affinity delivered by the newly developed scoring function has a Pearson correlation coefficient of 0.79 with the experimental value, which surpasses that of the conventional scoring functions. Further analysis provided a chemical understanding of the descriptors that contributed significantly to the improvement in prediction accuracy. Our approach and findings are useful for rational drug discovery.
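
The regression setup described in the abstract can be sketched roughly as follows: a PMF-like score is concatenated with ligand and protein fingerprint bits, a LASSO model is fitted on a training split, and the prediction is compared with the reference affinity through the Pearson correlation coefficient. Everything below is an assumption made for illustration only: the data are synthetic, the descriptor layout (one score column plus two 8-bit fingerprints), the value of `alpha`, and the helper names `make_complex`, `fit_lasso`, and `pearson_r` are hypothetical and are not the authors' descriptors, data set, or implementation. LightGBM is not shown.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_threshold(z, t):
    """Soft-thresholding operator used by LASSO coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_lasso(X, y, alpha, n_iter=200):
    """Minimize (1/2n)*||y - b0 - Xw||^2 + alpha*||w||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b0 = sum(y) / n
    resid = [yi - b0 for yi in y]  # residual for the current (w, b0)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        # Unpenalized intercept: absorb the mean residual.
        shift = sum(resid) / n
        b0 += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_w
    return w, b0


def make_complex(rng):
    """Synthetic complex: [PMF-like score] + ligand bits + protein bits (all made up)."""
    pmf = rng.gauss(0.0, 1.0)
    ligand_fp = [rng.randint(0, 1) for _ in range(8)]
    protein_fp = [rng.randint(0, 1) for _ in range(8)]
    x = [pmf] + ligand_fp + protein_fp
    # Arbitrary ground truth: the score plus one ligand bit and one protein bit matter.
    y = 6.0 - 1.0 * pmf + 1.2 * ligand_fp[0] - 0.8 * protein_fp[3] + rng.gauss(0.0, 0.5)
    return x, y


rng = random.Random(0)
data = [make_complex(rng) for _ in range(300)]
train, test = data[:200], data[200:]
X_train, y_train = [x for x, _ in train], [y for _, y in train]
X_test, y_test = [x for x, _ in test], [y for _, y in test]

w, b0 = fit_lasso(X_train, y_train, alpha=0.05)
pred = [b0 + sum(wj * xj for wj, xj in zip(w, x)) for x in X_test]
r = pearson_r(pred, y_test)
selected = [j for j, wj in enumerate(w) if wj != 0.0]  # descriptors kept by LASSO

print(f"Pearson r on the synthetic test split: {r:.2f}")
print(f"Indices of descriptors with nonzero weight: {selected}")
```

The L1 penalty drives the weights of uninformative descriptors to exactly zero, which is what makes it possible to inspect which descriptors a fitted model relies on. In practice one would more likely use an established implementation such as scikit-learn's `Lasso` rather than a hand-written solver.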

Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c79e/9178954/4782d99367eb/ao2c02822_0002.jpg

Similar articles

Boosted neural networks scoring functions for accurate ligand docking and ranking.

J Bioinform Comput Biol. 2018 Apr;16(2):1850004. doi: 10.1142/S021972001850004X. Epub 2018 Feb 4.

Machine learning in computational docking.

Artif Intell Med. 2015 Mar;63(3):135-52. doi: 10.1016/j.artmed.2015.02.002. Epub 2015 Feb 16.

Prediction of protein-ligand binding affinities using multiple instance learning.

J Mol Graph Model. 2010 Nov;29(3):492-7. doi: 10.1016/j.jmgm.2010.09.006. Epub 2010 Sep 29.

New machine learning and physics-based scoring functions for drug discovery.

Sci Rep. 2021 Feb 4;11(1):3198. doi: 10.1038/s41598-021-82410-1.

Task-Specific Scoring Functions for Predicting Ligand Binding Poses and Affinity and for Screening Enrichment.

J Chem Inf Model. 2018 Jan 22;58(1):119-133. doi: 10.1021/acs.jcim.7b00309. Epub 2017 Dec 20.

Enhance the performance of current scoring functions with the aid of 3D protein-ligand interaction fingerprints.

BMC Bioinformatics. 2017 Jul 18;18(1):343. doi: 10.1186/s12859-017-1750-5.

Application of Machine Learning Techniques to Predict Binding Affinity for Drug Targets: A Study of Cyclin-Dependent Kinase 2.

Curr Med Chem. 2021;28(2):253-265. doi: 10.2174/2213275912666191102162959.

Computational Prediction of Binding Affinity for CDK2-ligand Complexes. A Protein Target for Cancer Drug Discovery.

Curr Med Chem. 2022;29(14):2438-2455. doi: 10.2174/0929867328666210806105810.

An Overview of Scoring Functions Used for Protein-Ligand Interactions in Molecular Docking.

Interdiscip Sci. 2019 Jun;11(2):320-328. doi: 10.1007/s12539-019-00327-w. Epub 2019 Mar 15.

Cited by

The physics-AI dialogue in drug design.

RSC Med Chem. 2025 Jan 23;16(4):1499-1515. doi: 10.1039/d4md00869c. eCollection 2025 Apr 16.

Computer-Aided Drug Design and Drug Discovery: A Prospective Analysis.

Pharmaceuticals (Basel). 2023 Dec 22;17(1):22. doi: 10.3390/ph17010022.

SS-GNN: A Simple-Structured Graph Neural Network for Affinity Prediction.

ACS Omega. 2023 Jun 15;8(25):22496-22507. doi: 10.1021/acsomega.3c00085. eCollection 2023 Jun 27.

Synthesis, molecular docking, and binding Gibbs free energy calculation of β-nitrostyrene derivatives: Potential inhibitors of SARS-CoV-2 3CL protease.

J Mol Struct. 2023 Jul 15;1284:135409. doi: 10.1016/j.molstruc.2023.135409. Epub 2023 Mar 23.
