Method for prediction of protein function from sequence using the sequence-to-structure-to-function paradigm with application to glutaredoxins/thioredoxins and T1 ribonucleases.

Authors

Fetrow J S, Skolnick J

Affiliation

Center for Biochemistry and Biophysics, University at Albany, SUNY, 1400 Washington Avenue, Albany, NY 12222, USA.

Publication

J Mol Biol. 1998 Sep 4;281(5):949-68. doi: 10.1006/jmbi.1998.1993.

Abstract

The practical exploitation of the vast numbers of sequences in the genome sequence databases is crucially dependent on the ability to identify the function of each sequence. Unfortunately, current methods, including global sequence alignment and local sequence motif identification, are limited by the extent of sequence similarity between sequences of unknown and known function; these methods increasingly fail as the sequence identity diverges into and beyond the twilight zone of sequence identity. To address this problem, a novel method for identification of protein function based directly on the sequence-to-structure-to-function paradigm is described. Descriptors of protein active sites, termed "fuzzy functional forms" or FFFs, are created based on the geometry and conformation of the active site. By way of illustration, the active sites responsible for the disulfide oxidoreductase activity of the glutaredoxin/thioredoxin family and the RNA hydrolytic activity of the T1 ribonuclease family are presented. First, the FFFs are shown to correctly identify their corresponding active sites in a library of exact protein models produced by crystallography or NMR spectroscopy, most of which lack the specified activity. Next, these FFFs are used to screen for active sites in low-to-moderate resolution models produced by ab initio folding or threading prediction algorithms. Again, the FFFs can specifically identify the functional sites of these proteins from their predicted structures. The results demonstrate that low-to-moderate resolution models as produced by state-of-the-art tertiary structure prediction algorithms are sufficient to identify protein active sites. Prediction of a novel function for the gamma subunit of a yeast glycosyl transferase and prediction of the function of two hypothetical yeast proteins whose models were produced via threading are presented. 
This work suggests a means for the large-scale functional screening of genomic sequence databases based on the prediction of structure from sequence, then on the identification of functional active sites in the predicted structure.
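The idea of a geometric active-site descriptor can be illustrated with a minimal sketch. It assumes a descriptor is a list of required residue types plus allowed ranges for the pairwise C-alpha distances between them, with the width of each range standing in for the "fuzziness". The function name `match_descriptor`, the toy coordinates, and every distance range below are hypothetical and chosen only for illustration; they are not the FFF parameters from the paper.

```python
from itertools import permutations
from math import dist


def match_descriptor(residues, types, distance_ranges):
    """Return index tuples of residues that satisfy a geometric descriptor.

    residues: list of (residue_name, (x, y, z)) tuples, one per residue
    types: required residue names, one per descriptor position
    distance_ranges: {(i, j): (min, max)} over descriptor positions,
        giving the allowed C-alpha distance between positions i and j
    """
    hits = []
    # Brute-force search over ordered residue tuples; fine for a toy
    # example, far too slow for screening a real structure library.
    for combo in permutations(range(len(residues)), len(types)):
        if any(residues[k][0] != t for k, t in zip(combo, types)):
            continue
        if all(
            lo <= dist(residues[combo[i]][1], residues[combo[j]][1]) <= hi
            for (i, j), (lo, hi) in distance_ranges.items()
        ):
            hits.append(combo)
    return hits


# Toy "structure": residue name plus C-alpha coordinates (hypothetical).
residues = [
    ("CYS", (0.0, 0.0, 0.0)),
    ("GLY", (3.8, 0.0, 0.0)),
    ("PRO", (3.8, 3.8, 0.0)),
    ("CYS", (5.0, 0.0, 0.0)),
    ("PRO", (0.0, 8.0, 0.0)),
    ("CYS", (30.0, 30.0, 30.0)),
]

# Hypothetical descriptor: two cysteines and a proline with loose
# distance ranges between each pair of positions.
types = ["CYS", "CYS", "PRO"]
distance_ranges = {(0, 1): (4.0, 6.0), (0, 2): (7.0, 9.0), (1, 2): (8.5, 10.5)}

hits = match_descriptor(residues, types, distance_ranges)
print(hits)
```

Widening the ranges makes the descriptor more tolerant of coordinate error, which is the property that would matter when screening low-to-moderate resolution predicted models; the trade-off is more false matches in structures that lack the activity.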

