Fujimoto Kazuhiro J, Minami Shota, Yanai Takeshi
Institute of Transformative Bio-Molecules (WPI-ITbM), Nagoya University, Furocho, Chikusa, Nagoya 464-8601, Japan.
Department of Chemistry, Graduate School of Science, Nagoya University, Furocho, Chikusa, Nagoya 464-8601, Japan.
ACS Omega. 2022 May 25;7(22):19030-19039. doi: 10.1021/acsomega.2c02822. eCollection 2022 Jun 7.
We propose a novel machine-learning-based scoring function for drug discovery that incorporates ligand and protein structural information into a knowledge-based potential of mean force (PMF) score. Molecular docking, a simulation method for structure-based drug design (SBDD), is expected to reduce the enormous cost of conventional experimental methods in rational drug discovery. Molecular docking has two main purposes: predicting the structures in which ligands bind to target proteins, and predicting protein-ligand binding affinity. Currently available molecular docking programs predict ligand-binding structures accurately for many systems. However, accurate prediction of binding affinity remains challenging. In this study, we developed a new scoring function that adds fingerprints representing ligand and protein structures to the PMF score as descriptors. The scoring function was fitted by regression using two machine learning techniques: least absolute shrinkage and selection operator (LASSO) and light gradient boosting machine (LightGBM). On a test data set, the binding affinities predicted by the new scoring function had a Pearson correlation coefficient of 0.79 with the experimental values, surpassing conventional scoring functions. Further analysis provided a chemical understanding of the descriptors that contributed most to the improvement in prediction accuracy. Our approach and findings are useful for rational drug discovery.
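As a rough illustration of the kind of workflow the abstract describes (regress binding affinity on a set of descriptors with LASSO, then report the Pearson correlation on held-out data), the following is a minimal sketch under stated assumptions. It is not the authors' implementation: the descriptors are synthetic random numbers standing in for PMF terms and fingerprint bits, the "affinities" are generated from a made-up linear model, the names `pearson_r`, `lasso_fit`, `TRUE_W` and the value `alpha=0.1` are hypothetical, and the solver is a textbook coordinate-descent LASSO written with the standard library only. LightGBM is not shown.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_threshold(z, t):
    """Soft-thresholding operator that produces the LASSO's exact zeros."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def predict(b0, w, x):
    """Linear score: intercept plus weighted sum of descriptors."""
    return b0 + sum(wk * xk for wk, xk in zip(w, x))


def lasso_fit(X, y, alpha, n_iter=100):
    """Coordinate descent for (1/2n)*||y - b0 - Xw||^2 + alpha*||w||_1."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b0 = sum(y) / n
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            zj = 0.0   # squared norm of feature j
            for i in range(n):
                # Prediction with feature j left out.
                partial = predict(b0, w, X[i]) - w[j] * X[i][j]
                rho += X[i][j] * (y[i] - partial)
                zj += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (zj / n) if zj > 0 else 0.0
        # The intercept is not penalized: set it to the mean residual.
        b0 = sum(y[i] - (predict(b0, w, X[i]) - b0) for i in range(n)) / n
    return b0, w


# Synthetic stand-in data (assumption): six descriptors, three of them
# irrelevant, and a noisy linear "affinity". No real complexes are used.
random.seed(0)
TRUE_W = [1.5, -2.0, 0.0, 0.0, 0.8, 0.0]


def make_sample():
    x = [random.gauss(0.0, 1.0) for _ in TRUE_W]
    y = 6.0 + sum(wk * xk for wk, xk in zip(TRUE_W, x)) + random.gauss(0.0, 0.5)
    return x, y


data = [make_sample() for _ in range(120)]
train, test = data[:90], data[90:]
X_train, y_train = [x for x, _ in train], [y for _, y in train]
X_test, y_test = [x for x, _ in test], [y for _, y in test]

b0, w = lasso_fit(X_train, y_train, alpha=0.1)
y_pred = [predict(b0, w, x) for x in X_test]
r_test = pearson_r(y_pred, y_test)

print("intercept:", round(b0, 3))
print("weights:", [round(wk, 3) for wk in w])
print("Pearson r on held-out data:", round(r_test, 3))
```

The L1 penalty shrinks every weight toward zero and can set weakly informative ones exactly to zero, which is one reason LASSO coefficients lend themselves to the kind of descriptor-level interpretation the abstract mentions. The correlation printed here reflects only the synthetic data and says nothing about the reported value of 0.79.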