Weaver Daniel S, Keseler Ingrid M, Mackie Amanda, Paulsen Ian T, Karp Peter D
Bioinformatics Research Group, SRI International, 333 Ravenswood Ave, Menlo Park, CA 94025, USA.
BMC Syst Biol. 2014 Jun 30;8:79. doi: 10.1186/1752-0509-8-79.
Constraint-based models of Escherichia coli metabolic flux have played a key role in computational studies of cellular metabolism at the genome scale. We sought to develop a next-generation constraint-based E. coli model that achieved improved phenotypic prediction accuracy while being frequently updated and easy to use. We also sought to compare model predictions with experimental data to highlight open questions in E. coli biology.
We present EcoCyc-18.0-GEM, a genome-scale model of the E. coli K-12 MG1655 metabolic network. The model is automatically generated from the current state of EcoCyc using the MetaFlux software, enabling the release of multiple model updates per year. EcoCyc-18.0-GEM encompasses 1445 genes, 2286 unique metabolic reactions, and 1453 unique metabolites. We demonstrate a three-part validation of the model that breaks new ground in breadth and accuracy: (i) Comparison of simulated growth in aerobic and anaerobic glucose culture with experimental results from chemostat culture and simulation results from the E. coli modeling literature. (ii) Essentiality prediction for the 1445 genes represented in the model, in which EcoCyc-18.0-GEM achieves an improved accuracy of 95.2% in predicting the growth phenotype of experimental gene knockouts. (iii) Nutrient utilization predictions under 431 different media conditions, for which the model achieves an overall accuracy of 80.7%. The model's derivation from EcoCyc enables query and visualization via the EcoCyc website, facilitating model reuse and validation by inspection. We present an extensive investigation of disagreements between EcoCyc-18.0-GEM predictions and experimental data to highlight areas of interest to E. coli modelers and experimentalists, including 70 incorrect predictions of gene essentiality on glucose, 80 incorrect predictions of gene essentiality on glycerol, and 83 incorrect predictions of nutrient utilization.
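The reported accuracies can be cross-checked against the error counts given above. The following is a minimal sketch, assuming the 95.2% figure corresponds to the 70 incorrect glucose predictions over all 1445 genes, and the 80.7% figure to the 83 incorrect predictions over all 431 media conditions. The helper name `accuracy_percent` is hypothetical and not part of MetaFlux or EcoCyc.

```python
def accuracy_percent(total: int, incorrect: int) -> float:
    """Return the percentage of correct predictions, rounded to one decimal place."""
    return round(100.0 * (total - incorrect) / total, 1)


# Assumption: gene essentiality on glucose, 70 incorrect out of 1445 genes.
gene_accuracy = accuracy_percent(total=1445, incorrect=70)

# Assumption: nutrient utilization, 83 incorrect out of 431 media conditions.
nutrient_accuracy = accuracy_percent(total=431, incorrect=83)

print(gene_accuracy, nutrient_accuracy)
```

Under these assumptions both values match the percentages stated in the abstract.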
Significant advantages can be derived from the combination of model organism databases and flux balance modeling represented by MetaFlux. Interpretation of the EcoCyc database as a flux balance model results in a highly accurate metabolic model and provides a rigorous consistency check for information stored in the database.
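For readers unfamiliar with flux balance modeling, the textbook formulation is a linear program over reaction fluxes. The statement below is a generic sketch, not the specific formulation used by MetaFlux. The symbols are assumptions introduced here for illustration: $S$ is the stoichiometric matrix (metabolites by reactions), $v$ the flux vector, $c$ the objective weights (typically selecting a biomass reaction), and $l$, $u$ the lower and upper flux bounds.

```latex
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S\,v = 0, \\
& l \le v \le u .
\end{aligned}
```

In this framing, a gene knockout is usually simulated by setting the bounds of the affected reactions to zero, and a growth medium is specified through the bounds on the nutrient uptake reactions.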