Kleiger Gary, Panina Ekaterina M, Mallick Parag, Eisenberg David
Howard Hughes Medical Institute, University of California, Los Angeles-Department of Energy Institute of Genomics and Proteomics, UCLA, Los Angeles, California 90095, USA.
Protein Sci. 2004 Jan;13(1):221-9. doi: 10.1110/ps.03274104.
The identification of the enzymes involved in the metabolism of simple and complex carbohydrates presents one bioinformatic challenge in the post-genomic era. Here, we present the PFIT and PFRIT algorithms for identifying those proteins adopting the alpha/beta barrel fold that function as glycosidases. These algorithms are based on the observation that proteins adopting the alpha/beta barrel fold share positions in their tertiary structures having equivalent sets of atomic interactions. These are conserved tertiary interaction positions, which have been implicated in both structure and function. Glycosidases adopting the alpha/beta barrel fold share more conserved tertiary interactions than alpha/beta barrel proteins having other functions. The enrichment pattern of conserved tertiary interactions in the glycosidases is the information that PFIT and PFRIT use to predict whether any given alpha/beta barrel will function as a glycosidase. Using as a test set a database of 19 glycosidase and 45 nonglycosidase alpha/beta barrel proteins with low sequence similarity, PFIT and PFRIT correctly predict glycosidase function for 84% of the proteins known to function as glycosidases. PFIT and PFRIT incorrectly predict glycosidase function for 25% of the nonglycosidases. The program PSI-BLAST also correctly identifies 84% of the 19 glycosidases; however, it incorrectly predicts glycosidase function for 50% of the nonglycosidases (twofold greater than PFIT and PFRIT). Overall, we demonstrate that the structure-based PFIT and PFRIT algorithms are both more selective and sensitive for predicting glycosidase function than the sequence-based PSI-BLAST algorithm.
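The comparison reported above can be checked with a little arithmetic. The sketch below is not part of PFIT, PFRIT, or PSI-BLAST; it only turns the stated rates (84% of 19 glycosidases identified, 25% versus 50% of 45 nonglycosidases mispredicted) into expected counts. The function and variable names are hypothetical, and the counts are fractional because they are derived from the rounded percentages rather than taken from the paper's raw results.

```python
# Illustrative arithmetic only: names here are hypothetical and the
# counts are back-calculated from the rounded percentages in the abstract.

def expected_counts(n_pos, n_neg, sensitivity, false_positive_rate):
    """Convert reported rates into expected confusion-matrix counts."""
    true_pos = sensitivity * n_pos
    false_pos = false_positive_rate * n_neg
    return {
        "true_pos": true_pos,
        "false_neg": n_pos - true_pos,
        "false_pos": false_pos,
        "true_neg": n_neg - false_pos,
    }

def specificity(counts):
    """Fraction of nonglycosidases correctly rejected."""
    return counts["true_neg"] / (counts["true_neg"] + counts["false_pos"])

N_GLYCOSIDASES = 19      # test-set size stated in the abstract
N_NONGLYCOSIDASES = 45   # test-set size stated in the abstract

# Rates as stated in the abstract.
structure_based = expected_counts(N_GLYCOSIDASES, N_NONGLYCOSIDASES, 0.84, 0.25)
psi_blast = expected_counts(N_GLYCOSIDASES, N_NONGLYCOSIDASES, 0.84, 0.50)

# Ratio of false positives: the "twofold" figure in the abstract.
fp_ratio = psi_blast["false_pos"] / structure_based["false_pos"]

if __name__ == "__main__":
    print("PFIT/PFRIT specificity:", specificity(structure_based))
    print("PSI-BLAST specificity:", specificity(psi_blast))
    print("False-positive ratio:", fp_ratio)
```

Under these stated rates the two approaches recover the same share of glycosidases, so the difference between them in this arithmetic lies entirely in how many nonglycosidases each rejects.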