

iScore: A ML-Based Scoring Function for De Novo Drug Discovery.

Authors

Mahdizadeh Sayyed Jalil, Eriksson Leif A

Affiliation

Department of Chemistry and Molecular Biology, University of Gothenburg, Göteborg 405 30, Sweden.

Publication information

J Chem Inf Model. 2025 Mar 24;65(6):2759-2772. doi: 10.1021/acs.jcim.4c02192. Epub 2025 Mar 4.

Abstract

In the quest to accelerate de novo drug discovery, the development of efficient and accurate scoring functions represents a fundamental challenge. This study introduces iScore, a novel machine learning (ML)-based scoring function designed to predict the binding affinity of protein-ligand complexes with remarkable speed and precision. Uniquely, iScore circumvents the conventional reliance on explicit knowledge of protein-ligand interactions and a full picture of atomic contacts, instead leveraging a set of ligand and binding pocket descriptors to directly evaluate binding affinity. This approach makes it possible to skip the slow and inefficient conformational sampling stage, thereby enabling the rapid screening of ultra-large molecular libraries, a crucial advancement given the practically infinite dimensions of chemical space. iScore was rigorously trained and validated using the PDBbind 2020 refined set, CASF 2016, CSAR NRC-HiQ Set1/2, DUD-E, and target fishing data sets, employing three distinct ML methodologies: deep neural network (iScore-DNN), random forest (iScore-RF), and eXtreme gradient boosting (iScore-XGB). A hybrid model, iScore-Hybrid, was subsequently developed to incorporate the strengths of these individual base learners. The hybrid model demonstrated a Pearson correlation coefficient (R) of 0.78 and a root-mean-square error (RMSE) of 1.23 in cross-validation, outperforming the individual base learners and establishing new benchmarks for scoring power (R = 0.814, RMSE = 1.34), ranking power (ρ = 0.705), and screening power (success rate at top 10% = 73.7%). Moreover, iScore-Hybrid demonstrated strong performance in the target fishing benchmarking study.
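For readers unfamiliar with the scoring-power metrics quoted in the abstract, the following is a minimal sketch of how a Pearson correlation coefficient and an RMSE can be computed for a set of predicted binding affinities. The abstract does not state how iScore-Hybrid combines its base learners, so the unweighted average in `hybrid_prediction` is purely an assumption made for illustration, and all numeric values are made-up toy data rather than results from the paper.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def hybrid_prediction(base_predictions):
    """ASSUMPTION: combine base learners by an unweighted mean.

    The abstract does not describe the actual combination scheme;
    this helper is hypothetical and only illustrates the idea of an ensemble.
    """
    return [sum(values) / len(values) for values in zip(*base_predictions)]


# Toy affinities (made up for illustration, not data from the paper).
experimental = [4.2, 5.1, 6.3, 7.8]
dnn = [4.8, 5.0, 5.9, 7.1]
rf = [4.0, 5.6, 6.8, 7.2]
xgb = [3.9, 4.7, 6.1, 8.3]

hybrid = hybrid_prediction([dnn, rf, xgb])
print("R    =", round(pearson_r(experimental, hybrid), 3))
print("RMSE =", round(rmse(experimental, hybrid), 3))
```

Averaging is only the simplest possible ensemble; a stacked model that learns weights for each base learner would be an equally plausible reading of "hybrid".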

