Evolutionary approach to predicting the binding site residues of a protein from its primary sequence.

Author information

Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA.

Publication information

Proc Natl Acad Sci U S A. 2011 Mar 29;108(13):5313-8. doi: 10.1073/pnas.1102210108. Epub 2011 Mar 14.

Abstract

Protein binding site residues, especially catalytic residues, play a central role in protein function. Because more than 99% of the ~12 million protein sequences in the nonredundant protein database have no structural information, it is desirable to develop methods to predict the binding site residues of a protein from its primary sequence. This task is highly challenging, because the binding site residues constitute only a small portion of a protein. However, the binding site residues of a protein are clustered in its functional pocket(s), and their spatial patterns tend to be conserved in evolution. To take advantage of these evolutionary and structural principles, we constructed a database of ~50,000 templates (called the pocket-containing segment database), each of which includes not only a sequence segment that contains a functional pocket but also the structural attributes of the pocket. To use this database, we designed a template-matching technique, termed residue-matching profiling, and established a criterion for selecting templates for a query sequence. Finally, we developed a probabilistic model for assigning spatial scores to matched residues between the template and query sequence in local alignments using a set of selected scoring matrices and for computing the binding likelihood of each matched residue in the query sequence. From the likelihoods, one can predict the binding site residues in the query sequence. An automated computational pipeline was developed for our method. A performance evaluation shows that our method achieves 70% precision in predicting binding site residues at 60% sensitivity.
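As a reading aid for the reported figures, the following is a minimal sketch of how per-residue binding likelihoods could be turned into predictions and scored by precision and sensitivity. It is not the authors' pipeline: the function names, the likelihood values, the annotated positions, and the 0.5 cutoff are all hypothetical, and only the standard definitions of precision (TP / (TP + FP)) and sensitivity (TP / (TP + FN)) are assumed.

```python
from typing import List, Sequence, Tuple


def predict_binding_residues(likelihoods: Sequence[float], threshold: float) -> List[int]:
    """Return 0-based positions whose binding likelihood is at least `threshold`."""
    return [i for i, p in enumerate(likelihoods) if p >= threshold]


def precision_and_sensitivity(predicted: Sequence[int], actual: Sequence[int]) -> Tuple[float, float]:
    """Compute precision = TP / (TP + FP) and sensitivity = TP / (TP + FN)."""
    predicted_set, actual_set = set(predicted), set(actual)
    true_positives = len(predicted_set & actual_set)
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    sensitivity = true_positives / len(actual_set) if actual_set else 0.0
    return precision, sensitivity


# Hypothetical toy data: one likelihood per residue of a 10-residue query.
likelihoods = [0.05, 0.80, 0.10, 0.65, 0.90, 0.20, 0.55, 0.15, 0.30, 0.02]
# Hypothetical annotated binding site positions (0-based), made up for illustration.
true_sites = [1, 4, 6, 8, 9]

# The 0.5 cutoff is an arbitrary choice, not a value taken from the paper.
predicted = predict_binding_residues(likelihoods, threshold=0.5)
precision, sensitivity = precision_and_sensitivity(predicted, true_sites)
print(predicted, precision, sensitivity)
```

Raising the cutoff generally trades sensitivity for precision, which is why a result such as the one reported is stated as a precision at a given sensitivity.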
