Beyond the scope of Free-Wilson analysis: building interpretable QSAR models with machine learning algorithms.

Affiliation

Chemistry Innovation Center, Discovery Sciences, AstraZeneca R&D Mölndal, Sweden.

Publication information

J Chem Inf Model. 2013 Jun 24;53(6):1324-36. doi: 10.1021/ci4001376. Epub 2013 Jun 12.

DOI: 10.1021/ci4001376
PMID: 23789733

Abstract

A novel methodology was developed to build Free-Wilson-like local QSAR models by combining R-group signatures with the support vector machine (SVM) algorithm. Unlike Free-Wilson analysis, this method can make predictions for compounds with R-groups not present in the training set. Eleven public data sets were chosen as test cases for comparing the performance of the new method with several traditional modeling strategies, including Free-Wilson analysis. The results show that, in general, the R-group signature SVM models achieve better prediction accuracy than Free-Wilson analysis. Moreover, the predictions of R-group signature models are comparable to those of models using ECFP6 fingerprints and signatures for the whole compound. Most importantly, R-group contributions to the SVM model can be obtained by calculating the gradient for R-group signatures. For most of the studied data sets, these contributions correlate significantly with those from the corresponding Free-Wilson analysis. These results suggest that the R-group contributions can be used to interpret bioactivity data and highlight that the R-group signature based SVM modeling method is as interpretable as Free-Wilson analysis. Hence the signature SVM model can be a useful modeling tool for any drug discovery project.
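For context, the Free-Wilson baseline that the abstract compares against can be sketched as an additive least-squares fit on R-group indicator variables. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`one_hot`, `fit_free_wilson`, `predict`), the two-site scaffold, and the toy activity values are all hypothetical. It also shows the limitation the abstract mentions, namely that an R-group absent from the training set cannot be encoded at all.

```python
import numpy as np

def one_hot(compounds, vocab):
    """Encode compounds as 0/1 indicator columns, one per (site, R-group) pair."""
    X = np.zeros((len(compounds), len(vocab)))
    for row, groups in enumerate(compounds):
        for site, group in enumerate(groups):
            key = (site, group)
            if key not in vocab:
                # Free-Wilson has no coefficient for an R-group it never saw.
                raise KeyError(f"R-group {group!r} at site {site} not in training set")
            X[row, vocab[key]] = 1.0
    return X

def fit_free_wilson(compounds, activities):
    """Fit activity = intercept + sum of per-site R-group contributions."""
    vocab = {}
    for groups in compounds:
        for site, group in enumerate(groups):
            vocab.setdefault((site, group), len(vocab))
    X = np.hstack([np.ones((len(compounds), 1)), one_hot(compounds, vocab)])
    # The design matrix is rank deficient, so lstsq returns the
    # minimum-norm least-squares solution.
    coef, *_ = np.linalg.lstsq(X, np.asarray(activities, dtype=float), rcond=None)
    return vocab, coef

def predict(vocab, coef, compounds):
    """Predict activities for compounds built only from known R-groups."""
    X = np.hstack([np.ones((len(compounds), 1)), one_hot(compounds, vocab)])
    return X @ coef

# Hypothetical toy data: each compound is (R1, R2) on a shared scaffold.
train = [("H", "H"), ("Cl", "H"), ("H", "OMe")]
activity = [5.0, 6.0, 5.5]  # made-up, perfectly additive values

vocab, coef = fit_free_wilson(train, activity)
print(predict(vocab, coef, [("Cl", "OMe")]))  # additive estimate, approximately 6.5

try:
    predict(vocab, coef, [("Br", "H")])  # "Br" never appeared in training
except KeyError as err:
    print("Cannot predict:", err)
```

A descriptor-based representation of each R-group, such as the signatures used in the paper, avoids the `KeyError` case because a new R-group still maps to a feature vector that the model can evaluate.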

Similar articles

1
Beyond the scope of Free-Wilson analysis: building interpretable QSAR models with machine learning algorithms.
J Chem Inf Model. 2013 Jun 24;53(6):1324-36. doi: 10.1021/ci4001376. Epub 2013 Jun 12.
2
Beyond the scope of Free-Wilson analysis. 2: Can distance encoded R-group fingerprints provide interpretable nonlinear models?
J Chem Inf Model. 2014 Apr 28;54(4):1117-28. doi: 10.1021/ci500075q. Epub 2014 Apr 11.
3
Comparison of combinatorial clustering methods on pharmacological data sets represented by machine learning-selected real molecular descriptors.
J Chem Inf Model. 2011 Dec 27;51(12):3036-49. doi: 10.1021/ci2000083. Epub 2011 Dec 9.
4
Drug discovery using support vector machines. The case studies of drug-likeness, agrochemical-likeness, and enzyme inhibition predictions.
J Chem Inf Comput Sci. 2003 Nov-Dec;43(6):2048-56. doi: 10.1021/ci0340916.
5
A comprehensive support vector machine binary hERG classification model based on extensive but biased end point hERG data sets.
Chem Res Toxicol. 2011 Jun 20;24(6):934-49. doi: 10.1021/tx200099j. Epub 2011 May 6.
6
Profile-QSAR: a novel meta-QSAR method that combines activities across the kinase family to accurately predict affinity, selectivity, and cellular activity.
J Chem Inf Model. 2011 Aug 22;51(8):1942-56. doi: 10.1021/ci1005004. Epub 2011 Jul 19.
7
Adaptive variable-weighted support vector machine as optimized by particle swarm optimization algorithm with application of QSAR studies.
Talanta. 2011 Mar 15;84(1):13-8. doi: 10.1016/j.talanta.2010.11.039. Epub 2010 Nov 26.
8
Application of support vector machine (SVM) for prediction toxic activity of different data sets.
Toxicology. 2006 Jan 16;217(2-3):105-19. doi: 10.1016/j.tox.2005.08.019. Epub 2005 Oct 5.
9
Comparing the Influence of Simulated Experimental Errors on 12 Machine Learning Algorithms in Bioactivity Modeling Using 12 Diverse Data Sets.
J Chem Inf Model. 2015 Jul 27;55(7):1413-25. doi: 10.1021/acs.jcim.5b00101. Epub 2015 Jun 18.
10
Prediction of P-glycoprotein substrates by a support vector machine approach.
J Chem Inf Comput Sci. 2004 Jul-Aug;44(4):1497-505. doi: 10.1021/ci049971e.

Cited by

1
A context-based matched molecular pair analysis identifies structural transformations that reduce CYP1A2 inhibition.
RSC Med Chem. 2025 May 2. doi: 10.1039/d4md01012d.
2
Development and Validation of Atomic Group Descriptors for Substituent Effects.
J Comput Chem. 2025 May 30;46(14):e70131. doi: 10.1002/jcc.70131.
3
Molecular similarity: Theory, applications, and perspectives.
Artif Intell Chem. 2024 Dec;2(2). doi: 10.1016/j.aichem.2024.100077. Epub 2024 Aug 31.
4
Deconstructing Markush: Improving the R&D Efficiency Using Library Selection in Early Drug Discovery.
Pharmaceuticals (Basel). 2022 Sep 18;15(9):1159. doi: 10.3390/ph15091159.
5
Leveraging structural and 2D-QSAR to investigate the role of functional group substitutions, conserved surface residues and desolvation in triggering the small molecule-induced dimerization of hPD-L1.
BMC Chem. 2022 Jun 27;16(1):49. doi: 10.1186/s13065-022-00842-w.
6
Machine learning methods in chemoinformatics.
Wiley Interdiscip Rev Comput Mol Sci. 2014 Sep 1;4(5):468-481. doi: 10.1002/wcms.1183.
7
Self organising hypothesis networks: a new approach for representing and structuring SAR knowledge.
J Cheminform. 2014 May 8;6:21. doi: 10.1186/1758-2946-6-21. eCollection 2014.