Computational chemistry study of 3D-structure-function relationships for enzymes based on Markov models for protein electrostatic, HINT, and van der Waals potentials.

Authors

Concu Riccardo, Podda Gianni, Uriarte Eugenio, González-Díaz Humberto

Affiliations

Unit of Bioinformatics and Connectivity Analysis (UBICA), Institute for Industrial Pharmacy, Faculty of Pharmacy, and Department of Organic Chemistry, University of Santiago de Compostela, Santiago de Compostela (USC), 15782, Spain.

Publication information

J Comput Chem. 2009 Jul 15;30(9):1510-20. doi: 10.1002/jcc.21170.

Abstract

In a significant work, Dobson and Doig (J Mol Biol 2003, 330, 771) showed that proteins can be predicted to be enzymes or nonenzymes from their spatial structure without resorting to alignments. They used 52 protein features and a nonlinear support vector machine model to classify more than 1000 proteins collected from the PDB with 77% overall accuracy. The most useful features were the secondary-structure content, the amino acid frequencies, the number of disulphide bonds, and the largest cleft size. Working on the same dataset used by Dobson and Doig (D&D), in this article we report a simple model, based on Markov chain models (MCM), that classifies protein 3D structures as enzymatic or not from the spatial structure alone, again without resorting to alignments. Here we define, for the first time, a general MCM to calculate the electrostatic potential, molecular vibrations, van der Waals (vdw) interactions, and hydrophobic interactions (HINT), and we use them in comparative studies of potential fields and/or protein function prediction. The dataset is composed of 1371 proteins, divided into 689 enzymes and 682 nonenzymes; all proteins were collected from the PDB. The best model we found was a linear model obtained by linear discriminant analysis; it correctly classified 74.18% of the proteins using only two electrostatic potentials. In the work described here, we define 3D-HINT potentials (μ(k)) and use them for the first time to derive a classifier for protein enzymes. We analyzed ROC curves, domain of applicability, parametric assumptions, and desirability maps, and also tested nonlinear artificial neural network models, which did not improve on the linear model. In closing, this MCM allows fast calculation and comparison of different potentials, yielding accurate protein 3D structure-function relationships that are notably simpler than previous ones.
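
The abstract does not give the formulas behind the MCM potentials or the fitted discriminant function, so the following is only a minimal sketch of the general idea and not the authors' implementation. It assumes that residues within a distance cutoff are linked by a row-stochastic transition matrix weighted by a positive per-residue quantity, that the k-th average potential mu_k is the expectation of that quantity after k Markov steps, and that a two-feature Fisher linear discriminant with a midpoint threshold separates the classes. The function names, the 7.0 cutoff, the weights, and the toy data are all hypothetical.

```python
import math


def transition_matrix(coords, weights, cutoff=7.0):
    """Row-stochastic matrix (assumed form): from residue i, move to any
    residue j within `cutoff` (including i itself) with probability
    proportional to weights[j]. Weights must be positive."""
    n = len(coords)
    matrix = []
    for i in range(n):
        row = [weights[j] if math.dist(coords[i], coords[j]) <= cutoff else 0.0
               for j in range(n)]
        total = sum(row)  # > 0 because residue i is always its own neighbor
        matrix.append([value / total for value in row])
    return matrix


def average_potentials(coords, weights, k_max=2, cutoff=7.0):
    """Return [mu_0, ..., mu_k_max], where mu_k is the expected weight
    under p_k = p_0 * Pi^k and p_0 is uniform (an assumption)."""
    n = len(coords)
    pi = transition_matrix(coords, weights, cutoff)
    p = [1.0 / n] * n
    mus = []
    for _ in range(k_max + 1):
        mus.append(sum(pj * wj for pj, wj in zip(p, weights)))
        # Advance the probability vector by one Markov step.
        p = [sum(p[i] * pi[i][j] for i in range(n)) for j in range(n)]
    return mus


def fit_lda(class0, class1):
    """Two-feature Fisher discriminant with a pooled scatter matrix and
    a midpoint threshold (equal priors assumed). Returns (w, bias)."""
    def mean(rows):
        return [sum(r[d] for r in rows) / len(rows) for d in range(2)]

    m0, m1 = mean(class0), mean(class1)
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in ((class0, m0), (class1, m1)):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]  # must be nonzero
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # w = S^-1 (m1 - m0), using the closed-form 2x2 inverse.
    w = [(s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
         (-s[1][0] * diff[0] + s[0][0] * diff[1]) / det]
    mid = [(m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2]
    bias = -(w[0] * mid[0] + w[1] * mid[1])
    return w, bias


def predict(model, x):
    """Return 1 (e.g. enzyme) if the discriminant score is positive, else 0."""
    w, bias = model
    return 1 if w[0] * x[0] + w[1] * x[1] + bias > 0 else 0


# Toy usage with made-up values: two "residues" and two tiny classes.
mus = average_potentials([(0, 0, 0), (1, 0, 0)], [1.0, 3.0], k_max=2)
model = fit_lda([(0, 0), (1, 0), (0, 1), (1, 1)],
                [(4, 4), (5, 4), (4, 5), (5, 5)])
```

In such a scheme, each protein would contribute a short vector of mu_k values (two electrostatic ones in the best model reported above), and the linear discriminant would be fitted on those vectors for the enzyme and nonenzyme groups.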
