Christian Nils, May Patrick, Kempa Stefan, Handorf Thomas, Ebenhöh Oliver
Max-Planck-Institute for Molecular Plant Physiology, Potsdam-Golm, Germany.
Mol Biosyst. 2009 Dec;5(12):1889-903. doi: 10.1039/B915913b. Epub 2009 Sep 10.
Genome-scale metabolic networks that are derived automatically through sequence comparison are necessarily incomplete. We propose a strategy that incorporates genomic sequence data and metabolite profiles into modeling approaches to arrive at improved gene annotations and more complete genome-scale metabolic networks. The core of our strategy is an algorithm that computes minimal sets of reactions by which a draft network must be extended to be consistent with experimental observations. A particular strength of our approach is that it suggests alternative extensions and thereby produces experimentally testable hypotheses. We evaluate our strategy on the well-studied metabolic network of Escherichia coli and demonstrate how incorporating sequence data improves the predictions. We then apply our method to the recently sequenced green alga Chlamydomonas reinhardtii and suggest specific genes in its genome that are the strongest candidates for encoding the enzymes that catalyze the proposed reactions.
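The idea of extending a draft network by a minimal set of reactions can be illustrated with a small sketch. The abstract does not spell out the algorithm, so everything below is an assumption made for illustration: "consistent with experimental observations" is taken to mean that every observed metabolite is producible from a set of seed nutrients, producibility is computed by simple forward expansion, and the minimal sets are found by exhaustive search, which is only feasible for toy networks. The names `scope`, `minimal_extensions`, and the reactions `R1` to `R5` are hypothetical. Sequence data, which the paper uses to rank candidates, is not modeled here.

```python
from itertools import combinations


def scope(reactions, seeds):
    """Return all metabolites producible from `seeds` by repeatedly
    firing every reaction whose substrates are all available."""
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available


def minimal_extensions(draft, reference, seeds, observed):
    """Return every smallest set of reference-reaction names whose addition
    to the draft network makes all `observed` metabolites producible.

    Brute force over subsets, by increasing size (illustrative only).
    """
    candidates = sorted(name for name in reference if name not in draft)
    base = list(draft.values())
    for size in range(len(candidates) + 1):
        hits = [
            set(names)
            for names in combinations(candidates, size)
            if set(observed) <= scope(base + [reference[n] for n in names], seeds)
        ]
        if hits:
            return hits  # all alternatives of the smallest size
    return []  # observations cannot be explained by the reference set


# Toy data (hypothetical): each reaction is (substrates, products).
draft = {"R1": ({"A"}, {"B"})}
reference = {
    "R2": ({"B"}, {"C"}),
    "R3": ({"A"}, {"C"}),
    "R4": ({"C"}, {"D"}),
    "R5": ({"B"}, {"E"}),
}
seeds = {"A"}
observed = {"B", "D"}  # D is measured but not producible by the draft

result = minimal_extensions(draft, reference, seeds, observed)
print([sorted(s) for s in result])  # [['R2', 'R4'], ['R3', 'R4']]
```

In this toy case two equally small extensions explain the observed metabolite D, which mirrors the point that alternative possibilities are reported as competing, testable hypotheses rather than a single answer.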