Department of Chemical Engineering, Indian Institute of Technology Bombay, Powai, Mumbai, 400076, India.
Photosynth Res. 2013 Nov;118(1-2):181-90. doi: 10.1007/s11120-013-9910-6. Epub 2013 Aug 24.
A genome-scale metabolic model provides an overview of an organism's metabolic capability. These genome-specific metabolic reconstructions are based on the identification of gene-to-protein-to-reaction (GPR) associations, which in turn rest on homology with annotated genes from other organisms. Cyanobacteria are photosynthetic prokaryotes that have diverged appreciably from their nonphotosynthetic counterparts. They also show significant evolutionary divergence from plants, whose photosynthetic apparatus is well studied. We argue that context-specific sequence and domain similarity can add to the repertoire of GPR associations and significantly expand our view of the metabolic capability of cyanobacteria. We took an approach that combines the results of context-specific sequence-to-sequence similarity searches with those of sequence-to-profile searches. We employ PSI-BLAST for the former, and CDD, Pfam, and COG for the latter. An optimization algorithm was devised to arrive at a weighting scheme that combines the different lines of evidence, with KEGG-annotated GPRs as training data. We present the algorithm in the form of the software "Systematic, Homology-based Automated Re-annotation for Prokaryotes (SHARP)." We predicted 3,781 new GPR associations for the 10 prokaryotes considered, of which eight are cyanobacterial species. These new GPR associations fall in several metabolic pathways and were used to annotate 7,718 gaps in the metabolic network. The new annotations led to the discovery of several pathways that may be active, thereby providing new directions for metabolic engineering of these species for the production of useful products. A metabolic model developed on such a reconstructed network is likely to give better phenotypic predictions.
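To make the idea of a trained weighting scheme concrete, the following is a minimal sketch of how evidence from several search tools could be combined into a single score and how weights could be chosen against labeled GPRs. It is not the SHARP implementation: the abstract does not specify the optimization algorithm, so the weighted sum, the exhaustive grid search, the 0.5 decision threshold, the normalization of scores to [0, 1], and every name and number in the toy training set are assumptions made for illustration only.

```python
from itertools import product

# Evidence sources named in the abstract; the keys themselves are hypothetical.
SOURCES = ("psiblast", "cdd", "pfam", "cog")


def combined_score(evidence, weights):
    """Weighted sum of per-source scores (assumed to be pre-normalized to [0, 1])."""
    return sum(weights[s] * evidence.get(s, 0.0) for s in SOURCES)


def accuracy(weights, training, threshold=0.5):
    """Fraction of labeled candidates classified correctly at the given threshold."""
    correct = sum(
        (combined_score(evidence, weights) >= threshold) == label
        for evidence, label in training
    )
    return correct / len(training)


def fit_weights(training, steps=4):
    """Grid search over weights summing to 1 (an assumed stand-in optimizer)."""
    grid = [i / steps for i in range(steps + 1)]
    best_weights, best_acc = None, -1.0
    for combo in product(grid, repeat=len(SOURCES)):
        if abs(sum(combo) - 1.0) > 1e-9:
            continue  # keep only weight vectors that sum to 1
        weights = dict(zip(SOURCES, combo))
        acc = accuracy(weights, training)
        if acc > best_acc:
            best_weights, best_acc = weights, acc
    return best_weights, best_acc


# Made-up training data: (evidence scores, whether the GPR is a known annotation).
TRAINING = [
    ({"psiblast": 0.9, "cdd": 0.8, "pfam": 0.7, "cog": 0.9}, True),
    ({"psiblast": 0.2, "cdd": 0.9, "pfam": 0.8, "cog": 0.1}, True),
    ({"psiblast": 0.8, "cdd": 0.1, "pfam": 0.2, "cog": 0.1}, False),
    ({"psiblast": 0.1, "cdd": 0.2, "pfam": 0.1, "cog": 0.3}, False),
]

weights, acc = fit_weights(TRAINING)
```

In this sketch a candidate gene-reaction pair would be accepted when `combined_score` reaches the threshold. A real pipeline would likely start from E-values or bit scores, which would need to be transformed before a weighted sum is meaningful, and would validate the weights on held-out annotations rather than on the training set.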