
Profile-QSAR 2.0: Kinase Virtual Screening Accuracy Comparable to Four-Concentration IC50s for Realistically Novel Compounds.

Authors

Martin Eric J, Polyakov Valery R, Tian Li, Perez Rolando C

Affiliation

Novartis Institutes for Biomedical Research, 5300 Chiron Way, Emeryville, California 94608-2916, United States.

Publication information

J Chem Inf Model. 2017 Aug 28;57(8):2077-2088. doi: 10.1021/acs.jcim.7b00166. Epub 2017 Jul 26.

DOI:10.1021/acs.jcim.7b00166
PMID:28651433
Abstract

While conventional random forest regression (RFR) virtual screening models appear to have excellent accuracy on random held-out test sets, they prove lacking in actual practice. Analysis of 18 historical virtual screens showed that random test sets are far more similar to their training sets than are the compounds project teams actually order. A new, cluster-based "realistic" training/test set split, which mirrors the chemical novelty of real-life virtual screens, recapitulates the poor predictive power of RFR models in real projects. The original Profile-QSAR (pQSAR) method greatly broadened the domain of applicability over conventional models by using as independent variables a profile of activity predictions from all historical assays in a large protein family. However, the accuracy still fell short of experiment on realistic test sets. The improved "pQSAR 2.0" method replaces probabilities of activity from naïve Bayes categorical models at several thresholds with predicted IC50s from RFR models. Unexpectedly, the high accuracy also requires removing the RFR model for the actual assay of interest from the independent variable profile. With these improvements, pQSAR 2.0 activity predictions are now statistically comparable to medium-throughput four-concentration IC50 measurements even on the realistic test set. Beyond the yes/no activity predictions from a typical high-throughput screen (HTS) or conventional virtual screen, these semiquantitative IC50 predictions allow for predicted potency, ligand efficiency, lipophilic efficiency, and selectivity against antitargets, greatly facilitating hitlist triaging and enabling virtual screening panels such as toxicity panels and overall promiscuity predictions.
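The cluster-based "realistic" split mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the procedure, so everything here is an assumption made for illustration: the greedy leader clustering, the Tanimoto threshold, the rule of filling the training set from the largest clusters first, and the names `tanimoto`, `leader_cluster`, and `cluster_split`. Fingerprints are represented as plain sets of integer feature IDs rather than output from any real cheminformatics toolkit.

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]  # hypothetical: set of "on" feature IDs


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def leader_cluster(fps: Dict[str, Fingerprint], threshold: float) -> List[List[str]]:
    """Greedy leader clustering (an assumed, simple stand-in).

    Each compound joins the first cluster whose leader is at least
    `threshold` similar; otherwise it starts a new cluster.
    """
    leaders: List[str] = []
    clusters: List[List[str]] = []
    for name in sorted(fps):  # sorted for deterministic output
        for i, leader in enumerate(leaders):
            if tanimoto(fps[name], fps[leader]) >= threshold:
                clusters[i].append(name)
                break
        else:
            leaders.append(name)
            clusters.append([name])
    return clusters


def cluster_split(
    fps: Dict[str, Fingerprint],
    threshold: float = 0.5,
    train_fraction: float = 0.75,
) -> Tuple[List[str], List[str]]:
    """Assign whole clusters to train or test, largest clusters first.

    Because no cluster is divided, test compounds come from small
    clusters whose leaders are dissimilar to every training leader.
    """
    clusters = sorted(leader_cluster(fps, threshold), key=len, reverse=True)
    target = train_fraction * len(fps)
    train: List[str] = []
    test: List[str] = []
    for cluster in clusters:
        if len(train) < target:
            train.extend(cluster)
        else:
            test.extend(cluster)
    return train, test


# Toy data: three chemical "series" with no shared features between series.
toy = {
    "a1": frozenset({1, 2, 3, 4}),
    "a2": frozenset({1, 2, 3, 5}),
    "a3": frozenset({1, 2, 3, 6}),
    "b1": frozenset({10, 11, 12}),
    "b2": frozenset({10, 11, 13}),
    "c1": frozenset({20, 21}),
}
train, test = cluster_split(toy, threshold=0.5, train_fraction=0.6)
print("train:", train)
print("test:", test)
```

In contrast to a random split, which would typically place members of the same series on both sides, this kind of split keeps each series intact, so a model is evaluated on chemistry it has not seen during training.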

