Kallberg Yvonne, Persson Bengt
IFM Bioinformatics, Linköping University, Sweden.
FEBS J. 2006 Mar;273(6):1177-84. doi: 10.1111/j.1742-4658.2006.05153.x.
Dehydrogenases and reductases are enzymes of fundamental metabolic importance that often adopt a specific structure known as the Rossmann fold. This fold, consisting of a six-stranded beta-sheet surrounded by alpha-helices, is responsible for coenzyme binding. We have developed a method to identify Rossmann folds and predict their coenzyme specificity (NAD, NADP or FAD) using only the amino acid sequence as input. The method is based on hidden Markov models and sequence pattern analysis. The prediction sensitivity is 79% and the selectivity is close to 100%. The method was applied to a set of 68 genomes representing the three kingdoms archaea, bacteria and eukaryota. In prokaryotes, 3% of the genes were found to code for Rossmann-fold proteins, while the corresponding proportion in eukaryotes is only around 1%. In all genomes, NAD is the most common preferred cofactor (41-49%), followed by NADP (30-38%), while FAD is the least common (21%). However, the preponderance of NAD over NADP is most pronounced in archaea and least pronounced in eukaryotes. In all three kingdoms, only 3-8% of the Rossmann proteins are predicted to have more than one membrane-spanning segment, which is much lower than the frequency of membrane proteins in general. Analysis of the major protein types in eukaryotes reveals that the most common type of Rossmann protein (26%) is the short-chain dehydrogenases/reductases. In addition, the identified Rossmann proteins were analyzed with respect to further protein types, enzyme classes and redundancy. The described method is available at http://www.ifm.liu.se/bioinfo, where the preferred coenzyme and its binding region are predicted given an amino acid sequence as input.
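To give a feel for what sequence pattern analysis can look like in practice, the following minimal Python sketch scans a protein sequence for a glycine-rich motif of the form G-x-G-x-x-[G/A], which is commonly cited in the literature as a fingerprint of dinucleotide-binding Rossmann folds. The abstract does not state which patterns or hidden Markov models the authors actually use, so the pattern, the function name `find_glycine_rich_motifs` and the toy sequence are all assumptions made for illustration; a hit is at best a hint of a candidate binding region, not a prediction of coenzyme specificity.

```python
import re

# Illustrative pattern only (an assumption, not the published method):
# G, any residue, G, any two residues, then G or A.
# The lookahead lets overlapping candidates be reported as well.
GLY_RICH = re.compile(r"(?=(G.G..[GA]))")


def find_glycine_rich_motifs(sequence):
    """Return (1-based start position, matched motif) for every hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in GLY_RICH.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from any real protein.
    toy_sequence = "MKVAVIGAGFIGSALA"
    for start, motif in find_glycine_rich_motifs(toy_sequence):
        print(f"candidate motif {motif} at position {start}")
```

A simple pattern like this will produce both false positives and false negatives on real proteomes, which is presumably one reason to combine pattern analysis with hidden Markov models as the abstract describes.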