Kernel-based partial least squares: application to fingerprint-based QSAR with model visualization.

Affiliation

Schrödinger, Inc., 120 West 45th Street, New York, New York 10036, United States.

Publication information

J Chem Inf Model. 2013 Sep 23;53(9):2312-21. doi: 10.1021/ci400250c. Epub 2013 Aug 19.

DOI:10.1021/ci400250c
PMID:23901898
Abstract

Numerous regression-based and machine learning techniques are available for the development of linear and nonlinear QSAR models that can accurately predict biological endpoints. Such tools can be quite powerful in the hands of an experienced modeler, but too frequently a disconnect remains between the modeler and project chemist because the resulting QSAR models are effectively black boxes. As a result, learning methods that yield models that can be visualized in the context of chemical structures are in high demand. In this work, we combine direct kernel-based PLS with Canvas 2D fingerprints to arrive at predictive QSAR models that can be projected onto the atoms of a chemical structure, allowing immediate identification of favorable and unfavorable characteristics. The method is validated using binding affinities for ligands from 10 different protein targets covering 7 distinct protein families. Models with significant predictive ability (test set Q² > 0.5) are obtained for 6 of 10 data sets, and fingerprints are shown to consistently outperform large collections of classical physicochemical and topological descriptors. In addition, we demonstrate how a simple bootstrapping technique may be employed to obtain uncertainties that provide meaningful estimates of prediction accuracy.
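
The workflow outlined in the abstract (a kernel on binary fingerprints, PLS on the kernel matrix, a test-set Q², and bootstrapped uncertainties) can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration, not the authors' implementation: the Tanimoto kernel, the column-centering of the kernel matrix, the NIPALS PLS1 routine, the Q² definition (relative to the test-set mean), and all function and class names are hypothetical. Random bit vectors would stand in for Canvas 2D fingerprints, and the atom-level visualization is not covered.

```python
import numpy as np


def tanimoto_kernel(A, B):
    """Tanimoto similarity between rows of two binary fingerprint matrices."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    common = A @ B.T                                  # bits set in both
    union = A.sum(axis=1)[:, None] + B.sum(axis=1)[None, :] - common
    return np.divide(common, union, out=np.zeros_like(common), where=union > 0)


def fit_pls1(X, y, n_components):
    """NIPALS PLS1 on centered X and y; returns regression coefficients."""
    X = np.array(X, dtype=float)
    y = np.array(y, dtype=float)
    W, P, q = [], [], []
    for _ in range(n_components):
        w = X.T @ y
        norm = np.linalg.norm(w)
        if norm < 1e-12:                              # nothing left to explain
            break
        w /= norm
        t = X @ w                                     # latent scores
        tt = t @ t
        p = X.T @ t / tt                              # X loadings
        c = y @ t / tt                                # y loading
        X -= np.outer(t, p)                           # deflate X and y
        y -= c * t
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)


class DirectKernelPLS:
    """Assumed formulation: ordinary PLS with kernel columns as descriptors."""

    def __init__(self, n_components=3):
        self.n_components = n_components

    def fit(self, F, y):
        y = np.asarray(y, dtype=float)
        self.F_train = np.asarray(F, dtype=float)
        K = tanimoto_kernel(self.F_train, self.F_train)
        self.k_mean = K.mean(axis=0)                  # column means for centering
        self.y_mean = y.mean()
        self.beta = fit_pls1(K - self.k_mean, y - self.y_mean, self.n_components)
        return self

    def predict(self, F):
        K = tanimoto_kernel(F, self.F_train)
        return (K - self.k_mean) @ self.beta + self.y_mean


def q_squared(y_true, y_pred):
    """1 - SS_res / SS_tot, with SS_tot taken about the mean of y_true."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def bootstrap_predict(F_train, y_train, F_new, n_boot=50, n_components=3, seed=0):
    """Refit on resampled training sets; return mean and std of predictions."""
    rng = np.random.default_rng(seed)
    F_train = np.asarray(F_train)
    y_train = np.asarray(y_train, dtype=float)
    preds = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_train), size=len(y_train))
        model = DirectKernelPLS(n_components).fit(F_train[idx], y_train[idx])
        preds.append(model.predict(F_new))
    preds = np.array(preds)
    return preds.mean(axis=0), preds.std(axis=0)
```

In this sketch, the spread of predictions across bootstrap refits serves as a per-compound uncertainty: a compound whose prediction changes a lot when the training set is resampled is one the model is less sure about. Whether that matches the bootstrapping scheme used in the paper cannot be determined from the abstract alone.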

Similar articles

1
Kernel-based partial least squares: application to fingerprint-based QSAR with model visualization.
J Chem Inf Model. 2013 Sep 23;53(9):2312-21. doi: 10.1021/ci400250c. Epub 2013 Aug 19.
2
Combinatorial QSAR modeling of specificity and subtype selectivity of ligands binding to serotonin receptors 5HT1E and 5HT1F.
J Chem Inf Model. 2008 May;48(5):997-1013. doi: 10.1021/ci700404c. Epub 2008 May 10.
3
Predictive QSAR modeling of HIV reverse transcriptase inhibitor TIBO derivatives.
Eur J Med Chem. 2009 Apr;44(4):1509-24. doi: 10.1016/j.ejmech.2008.07.020. Epub 2008 Jul 24.
4
Profile-QSAR: a novel meta-QSAR method that combines activities across the kinase family to accurately predict affinity, selectivity, and cellular activity.
J Chem Inf Model. 2011 Aug 22;51(8):1942-56. doi: 10.1021/ci1005004. Epub 2011 Jul 19.
5
A comparative QSAR study using CoMFA, HQSAR, and FRED/SKEYS paradigms for estrogen receptor binding affinities of structurally diverse compounds.
J Chem Inf Comput Sci. 2004 Mar-Apr;44(2):758-65. doi: 10.1021/ci0342526.
6
Fuzzy tricentric pharmacophore fingerprints. 2. Application of topological fuzzy pharmacophore triplets in quantitative structure-activity relationships.
J Chem Inf Model. 2008 Feb;48(2):409-25. doi: 10.1021/ci7003237. Epub 2008 Feb 7.
7
Application of validated QSAR models of D1 dopaminergic antagonists for database mining.
J Med Chem. 2005 Nov 17;48(23):7322-32. doi: 10.1021/jm049116m.
8
Toward the prediction of class I and II mouse major histocompatibility complex-peptide-binding affinity: in silico bioinformatic step-by-step guide using quantitative structure-activity relationships.
Methods Mol Biol. 2007;409:227-45. doi: 10.1007/978-1-60327-118-9_16.
9
In silico binary classification QSAR models based on 4D-fingerprints and MOE descriptors for prediction of hERG blockage.
J Chem Inf Model. 2010 Jul 26;50(7):1304-18. doi: 10.1021/ci100081j.
10
Development of QSAR models to predict and interpret the biological activity of artemisinin analogues.
J Chem Inf Comput Sci. 2004 Jul-Aug;44(4):1440-9. doi: 10.1021/ci0499469.

Cited by

1
Band Gap and Reorganization Energy Prediction of Conducting Polymers by the Integration of Machine Learning and Density Functional Theory.
J Chem Inf Model. 2025 Jun 9;65(11):5360-5369. doi: 10.1021/acs.jcim.5c00345. Epub 2025 May 28.
2
Advancing efficiency in deep-blue OLEDs: Exploring a machine learning-driven multiresonance TADF molecular design.
Sci Adv. 2025 Jan 24;11(4):eadr1326. doi: 10.1126/sciadv.adr1326. Epub 2025 Jan 22.
3
Quantitative Nontarget Analysis of CECs in Environmental Samples Can Be Improved by Considering All Mass Adducts.
Anal Chem. 2024 Jan 9;96(1):229-237. doi: 10.1021/acs.analchem.3c03791. Epub 2023 Dec 21.
4
Recent Advances in Machine-Learning-Based Chemoinformatics: A Comprehensive Review.
Int J Mol Sci. 2023 Jul 15;24(14):11488. doi: 10.3390/ijms241411488.
5
Novel Thiosemicarbazone Quantum Dots in the Treatment of Alzheimer's Disease Combining In Silico Models Using Fingerprints and Physicochemical Descriptors.
ACS Omega. 2023 Mar 17;8(12):11076-11099. doi: 10.1021/acsomega.2c07934. eCollection 2023 Mar 28.
6
Design, Biological Evaluation, and Computer-Aided Analysis of Dihydrothiazepines as Selective Antichlamydial Agents.
J Med Chem. 2023 Feb 9;66(3):2116-2142. doi: 10.1021/acs.jmedchem.2c01894. Epub 2023 Jan 25.
7
Integration of fingerprint-based similarity searching and kernel-based partial least squares analysis to predict inhibitory activity against CSK, HER2, JAK1, JAK2, and JAK3.
Mol Divers. 2024 Apr;28(2):497-507. doi: 10.1007/s11030-022-10596-1. Epub 2023 Jan 17.
8
Design of Organic Electronic Materials With a Goal-Directed Generative Model Powered by Deep Neural Networks and High-Throughput Molecular Simulations.
Front Chem. 2022 Jan 17;9:800370. doi: 10.3389/fchem.2021.800370. eCollection 2021.
9
Exploring the antidiabetic potential of compounds isolated from using computational aproach: ligand-based virtual screening.
In Silico Pharmacol. 2021 Apr 3;9(1):25. doi: 10.1007/s40203-021-00084-z. eCollection 2021.
10
A machine learning approach to predict surgical learning curves.
Surgery. 2020 Feb;167(2):321-327. doi: 10.1016/j.surg.2019.10.008. Epub 2019 Nov 18.