Interaction model based on local protein substructures generalizes to the entire structural enzyme-ligand space.

Authors

Strömbergsson Helena, Daniluk Pawel, Kryshtafovych Andriy, Fidelis Krzysztof, Wikberg Jarl E S, Kleywegt Gerard J, Hvidsten Torgeir R

Affiliations

The Linnaeus Centre for Bioinformatics, Uppsala University, Uppsala, Sweden; Department of Biophysics, Faculty of Physics, University of Warsaw, Warsaw, Poland.

Publication information

J Chem Inf Model. 2008 Nov;48(11):2278-88. doi: 10.1021/ci800200e.

Abstract

Chemogenomics is a new strategy in in silico drug discovery, where the ultimate goal is to understand molecular recognition for all molecules interacting with all proteins in the proteome. To study such cross interactions, methods that can generalize over proteins that vary greatly in sequence, structure, and function are needed. We present a general quantitative approach to protein-ligand binding affinity prediction that spans the entire structural enzyme-ligand space. The model was trained on a data set composed of all available enzymes cocrystallized with druglike ligands, taken from four publicly available interaction databases, for which a crystal structure is available. Each enzyme was characterized by a set of local descriptors of protein structure that describe the binding site of the cocrystallized ligand. The ligands in the training set were described by traditional QSAR descriptors. To evaluate the model, a comprehensive test set consisting of enzyme structures and ligands was manually curated. The test set contained enzyme-ligand complexes for which no crystal structures were available, and thus the binding modes were unknown. The test set enzymes were therefore characterized by matching their entire structures to the local descriptor library constructed from the training set. Both the training and the test set contained enzyme-ligand complexes from all major enzyme classes, and the enzymes spanned a large range of sequences and folds. The experimental binding affinities (pKi) ranged from 0.5 to 11.9 (0.7-11.0 in the test set). The induced model predicted the binding affinities of the external test set enzyme-ligand complexes with an r² of 0.53 and an RMSEP of 1.5. This demonstrates that the use of local descriptors makes it possible to create rough predictive models that can generalize over a wide range of protein targets.
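The two evaluation figures quoted in the abstract measure different things: RMSEP is the typical size of the prediction error in pKi units, while the correlation-based statistic reflects how well predictions track the ordering and spread of the observed values. The minimal Python sketch below shows how such figures could be computed from paired observed and predicted affinities. It assumes the reported statistic is the squared Pearson correlation, which the abstract does not state. The function names and the three example values are hypothetical illustrations, not data from the study.

```python
import math


def rmsep(observed, predicted):
    """Root-mean-square error of prediction over paired values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Squared Pearson correlation (an assumed definition, see lead-in)."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov * cov / (var_o * var_p)


# Hypothetical pKi values, chosen only to illustrate the calculation.
observed_pki = [5.0, 7.0, 9.0]
predicted_pki = [6.0, 7.0, 8.0]

error = rmsep(observed_pki, predicted_pki)
correlation = r_squared(observed_pki, predicted_pki)
print(f"RMSEP = {error:.3f}, r^2 = {correlation:.3f}")
```

In this toy case the predictions are perfectly correlated with the observations yet compressed toward the mean, so the squared correlation is 1 while the RMSEP is still nonzero. This is why the two numbers are usually reported together.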
