Department of Bioengineering, University of California, Berkeley, CA, USA.
Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
Mol Syst Biol. 2023 Dec 6;19(12):e11566. doi: 10.15252/msb.202311566. Epub 2023 Oct 27.
The Escherichia coli genome-scale metabolic model (GEM) is an exemplar systems biology model for the simulation of cellular metabolism. Experimental validation of model predictions is essential to pinpoint uncertainty and ensure continued development of accurate models. Here, we quantified the accuracy of four successive E. coli GEMs using published mutant fitness data across thousands of genes and 25 different carbon sources. This evaluation demonstrated the utility of the area under a precision-recall curve relative to alternative accuracy metrics. An analysis of errors in the latest (iML1515) model identified several vitamins/cofactors that are likely available to mutants despite being absent from the experimental growth medium and highlighted isoenzyme gene-protein-reaction mapping as a key source of inaccurate predictions. A machine learning approach further identified metabolic fluxes through hydrogen ion exchange and specific central metabolism branch points as important determinants of model accuracy. This work outlines improved practices for the assessment of GEM accuracy with high-throughput mutant fitness data and highlights promising areas for future model refinement in E. coli and beyond.
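The accuracy metric named above, the area under a precision-recall curve, can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state which class is treated as positive, how mutant fitness values are thresholded, or how the curve is integrated. The function name `average_precision`, the toy labels and scores, and the step-wise (average precision) form of the integral are all assumptions made for illustration.

```python
def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    labels: sequence of 0/1, where 1 marks the positive class
            (hypothetical: which class is positive is an assumption here).
    scores: sequence of floats; a higher score means a more confident
            positive prediction.
    """
    # Rank predictions from most to least confident.
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    total_pos = sum(label for _, label in pairs)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    area = 0.0
    true_pos = 0
    prev_recall = 0.0
    i = 0
    while i < len(pairs):
        # Items tied at the same score share a single threshold.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            true_pos += pairs[j][1]
            j += 1
        precision = true_pos / j          # j items predicted positive so far
        recall = true_pos / total_pos
        # Add the rectangle between the previous and current recall.
        area += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return area


# Toy data only; these are not values from the study.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
toy_auc = average_precision(toy_labels, toy_scores)
```

Because precision and recall are both defined relative to the positive class, this quantity is not affected by the number of true negatives, which is one commonly cited reason to prefer it over overall accuracy when the two classes are imbalanced.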