Enhancing the accuracy of chemogenomic models with a three-dimensional binding site kernel.

Affiliation

Structural Chemogenomics, Laboratory of Therapeutical Innovation, UMR 7200 CNRS, University of Strasbourg, F-67400 Illkirch, France.

Publication information

J Chem Inf Model. 2011 Jul 25;51(7):1593-603. doi: 10.1021/ci200166t. Epub 2011 Jun 21.

DOI: 10.1021/ci200166t
PMID: 21644501

Abstract

Computational chemogenomic (or proteochemometric) methods predict target-ligand interactions by training machine learning algorithms on known experimental data in order to distinguish attributes of true from false target-ligand pairs. Many ligand and target descriptors can be used for training and predicting binary associations or even binding affinities. Several chemogenomic studies have not noticed any real benefit in using 3-D structural target descriptors with respect to simpler sequence-based or property-based information. To assess whether this observation results from inaccurate target description or from the fact that 3-D information is simply not required in chemogenomic modeling, we used a target kernel measuring the distance between target-ligand binding sites of known X-ray structures. When used in combination with a standard ligand kernel in a support vector machine (SVM) classifier, the 3-D target kernel significantly outperforms a sequence-based target kernel in discriminating 2882 target-ligand PDB complexes from 9128 false pairs, whatever the modeling procedure (local or global). The best SVM models could be successfully applied to predict, with very high recall (70%), precision (99%), and specificity (99%), target-ligand associations for an external set of 14,117 ligands and 531 targets. In most of the cases, pooling all data in a global model gave better statistics than just discretizing specific target-ligand subspaces in local models. The current study clearly demonstrates that chemogenomic models taking both ligand and target information outperform simpler ligand-based models. 
It also permits one to design good modeling practices in predicting target-ligand pairing for a large array of targets: (i) ligand-based models are precise enough if sufficient ligand information (>40-50 diverse ligands) is known; (ii) if not, structure-based chemogenomic models (associating a ligand kernel to a structure-based target kernel) are recommended for proteins of known holostructures; (iii) sequence-based chemogenomic models (associating a ligand kernel to a sequence-based target kernel) can still be used with a very good accuracy for the remaining targets.
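
The abstract describes a chemogenomic model that combines a ligand kernel with a target kernel derived from distances between binding sites. A minimal sketch of that idea follows. The abstract does not give the kernel formulas, so everything specific here is an assumption made for illustration: the Tanimoto ligand kernel, the Gaussian form of the binding-site kernel, the product used to combine them, and all function names and toy values.

```python
import math


def tanimoto_kernel(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits (assumed ligand kernel)."""
    if not fp_a and not fp_b:
        return 1.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def site_kernel(distance, sigma=1.0):
    """Turn a binding-site distance into a similarity (assumed Gaussian form)."""
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


def pair_kernel(pair_a, pair_b, site_distance, sigma=1.0):
    """Similarity of two (target, ligand) pairs as target kernel times ligand kernel (assumed combination)."""
    (target_a, ligand_a), (target_b, ligand_b) = pair_a, pair_b
    k_target = site_kernel(site_distance(target_a, target_b), sigma)
    k_ligand = tanimoto_kernel(ligand_a, ligand_b)
    return k_target * k_ligand


def gram_matrix(pairs, site_distance, sigma=1.0):
    """Pairwise kernel matrix over a list of (target, ligand) pairs."""
    return [[pair_kernel(p, q, site_distance, sigma) for q in pairs] for p in pairs]


# Hypothetical toy data: three targets with made-up binding-site distances.
TOY_DISTANCES = {
    frozenset(("T1", "T2")): 0.5,
    frozenset(("T1", "T3")): 2.0,
    frozenset(("T2", "T3")): 1.5,
}


def toy_site_distance(target_a, target_b):
    """Look up the made-up distance between two toy targets."""
    if target_a == target_b:
        return 0.0
    return TOY_DISTANCES[frozenset((target_a, target_b))]


pairs = [("T1", {1, 2, 3}), ("T2", {2, 3, 4}), ("T3", {7, 8})]
gram = gram_matrix(pairs, toy_site_distance)
```

A matrix like `gram` could then be handed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. A Gaussian of an arbitrary distance is not guaranteed to be positive semidefinite, so a real binding-site kernel would need checking before use.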


Similar articles

1. Enhancing the accuracy of chemogenomic models with a three-dimensional binding site kernel.
   J Chem Inf Model. 2011 Jul 25;51(7):1593-603. doi: 10.1021/ci200166t. Epub 2011 Jun 21.
2. Development and validation of a novel protein-ligand fingerprint to mine chemogenomic space: application to G protein-coupled receptors and their ligands.
   J Chem Inf Model. 2009 Apr;49(4):1049-62. doi: 10.1021/ci800447g.
3. Ligand prediction from protein sequence and small molecule information using support vector machines and fingerprint descriptors.
   J Chem Inf Model. 2009 Apr;49(4):767-79. doi: 10.1021/ci900004a.
4. Ligand prediction for orphan targets using support vector machines and various target-ligand kernels is dominated by nearest neighbor effects.
   J Chem Inf Model. 2009 Oct;49(10):2155-67. doi: 10.1021/ci9002624.
5. Generalized modeling of enzyme-ligand interactions using proteochemometrics and local protein substructures.
   Proteins. 2006 Nov 15;65(3):568-79. doi: 10.1002/prot.21163.
6. Domain-based small molecule binding site annotation.
   BMC Bioinformatics. 2006 Mar 17;7:152. doi: 10.1186/1471-2105-7-152.
7. Rough set-based proteochemometrics modeling of G-protein-coupled receptor-ligand interactions.
   Proteins. 2006 Apr 1;63(1):24-34. doi: 10.1002/prot.20777.
8. Predicting protein-ligand binding affinities using novel geometrical descriptors and machine-learning methods.
   J Chem Inf Comput Sci. 2004 Mar-Apr;44(2):699-703. doi: 10.1021/ci034246+.
9. Characterization of domain-peptide interaction interface: a case study on the amphiphysin-1 SH3 domain.
   J Mol Biol. 2008 Feb 29;376(4):1201-14. doi: 10.1016/j.jmb.2007.12.054. Epub 2008 Jan 3.
10. Kernel methods for predicting protein-protein interactions.
    Bioinformatics. 2005 Jun;21 Suppl 1:i38-46. doi: 10.1093/bioinformatics/bti1016.

Cited by

1. Drug Target Identification with Machine Learning: How to Choose Negative Examples.
   Int J Mol Sci. 2021 May 12;22(10):5118. doi: 10.3390/ijms22105118.
2. Open-source chemogenomic data-driven algorithms for predicting drug-target interactions.
   Brief Bioinform. 2019 Jul 19;20(4):1465-1474. doi: 10.1093/bib/bby010.
3. Large-Scale Prediction of Drug-Target Interaction: a Data-Centric Review.
   AAPS J. 2017 Sep;19(5):1264-1275. doi: 10.1208/s12248-017-0092-6. Epub 2017 Jun 2.
4. Insights into an original pocket-ligand pair classification: a promising tool for ligand profile prediction.
   PLoS One. 2013 Jun 20;8(6):e63730. doi: 10.1371/journal.pone.0063730. Print 2013.
5. Comparison of ultra-fast 2D and 3D ligand and target descriptors for side effect prediction and network analysis in polypharmacology.
   Br J Pharmacol. 2013 Oct;170(3):557-67. doi: 10.1111/bph.12294.
6. Network pharmacology strategies toward multi-target anticancer therapies: from computational models to experimental design principles.
   Curr Pharm Des. 2014;20(1):23-36. doi: 10.2174/13816128113199990470.
7. Which compound to select in lead optimization? Prospectively validated proteochemometric models guide preclinical development.
   PLoS One. 2011;6(11):e27518. doi: 10.1371/journal.pone.0027518. Epub 2011 Nov 23.
8. STITCH 3: zooming in on protein-chemical interactions.
   Nucleic Acids Res. 2012 Jan;40(Database issue):D876-80. doi: 10.1093/nar/gkr1011. Epub 2011 Nov 9.