Linköping University, Department of Biomedical Engineering, Linköping, Sweden.
Cardiovascular Medicine Unit, Department of Medicine (Solna), Karolinska Institutet, Stockholm, Sweden.
PLoS Comput Biol. 2022 Apr 11;18(4):e1009999. doi: 10.1371/journal.pcbi.1009999. eCollection 2022 Apr.
Accurate measurements of metabolic fluxes in living cells are central to metabolism research and metabolic engineering. The gold standard method is model-based metabolic flux analysis (MFA), where fluxes are estimated indirectly from mass isotopomer data using a mathematical model of the metabolic network. A critical step in MFA is model selection: choosing which compartments, metabolites, and reactions to include in the metabolic network model. Model selection is often done informally during the modelling process, based on the same data that are used for model fitting (estimation data). This can lead to either overly complex models (overfitting) or overly simple ones (underfitting), in both cases resulting in poor flux estimates. Here, we propose a method for model selection based on independent validation data. We demonstrate in simulation studies that this method consistently chooses the correct model in a way that is independent of errors in the assumed measurement uncertainty. This independence is beneficial, since estimating the true magnitude of these errors can be difficult. In contrast, commonly used model selection methods based on the χ²-test choose different model structures depending on the believed measurement uncertainty; this can lead to errors in flux estimates, especially when the believed magnitude of the error is substantially off. We present a new approach for quantifying the prediction uncertainty of mass isotopomer distributions in other labelling experiments, to check for problems with too much or too little novelty in the validation data. Finally, in an isotope tracing study on human mammary epithelial cells, the validation-based model selection method identified pyruvate carboxylase as a key model component. Our results argue that validation-based model selection should be an integral part of MFA model development.
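The core idea of validation-based model selection can be illustrated with a minimal sketch. This is a toy example under stated assumptions, not the authors' implementation: polynomial models of increasing degree stand in for candidate metabolic network models, and the function names (`fit_polynomial`, `sum_squared_residuals`, `select_by_validation`) are hypothetical. Each candidate is fitted to the estimation data only, and the candidate with the smallest sum of squared residuals on the independent validation data is selected. Note that no assumed measurement standard deviation enters the selection step.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Least-squares fit of a polynomial; a toy stand-in for fitting an MFA model."""
    return np.polyfit(x, y, degree)

def sum_squared_residuals(coeffs, x, y):
    """Unweighted sum of squared residuals of the fitted polynomial on (x, y)."""
    return float(np.sum((np.polyval(coeffs, x) - y) ** 2))

def select_by_validation(x_est, y_est, x_val, y_val, degrees):
    """Fit every candidate on estimation data, score it on validation data,
    and return the candidate with the smallest validation SSR plus all scores."""
    scores = {}
    for degree in degrees:
        coeffs = fit_polynomial(x_est, y_est, degree)
        scores[degree] = sum_squared_residuals(coeffs, x_val, y_val)
    best = min(scores, key=scores.get)
    return best, scores

# Synthetic demo data (assumption: the "true" model is a quadratic).
rng = np.random.default_rng(0)

def true_model(t):
    return 1.0 + 2.0 * t - 3.0 * t ** 2

x_est = np.linspace(0.0, 1.0, 20)       # estimation data inputs
x_val = np.linspace(0.025, 0.975, 20)   # independent validation data inputs
y_est = true_model(x_est) + rng.normal(0.0, 0.05, x_est.size)
y_val = true_model(x_val) + rng.normal(0.0, 0.05, x_val.size)

best, scores = select_by_validation(x_est, y_est, x_val, y_val, [0, 1, 2, 3, 4, 5])
print("selected degree:", best)
for degree, ssr in scores.items():
    print(f"degree {degree}: validation SSR = {ssr:.4f}")
```

In this sketch, an underfitted candidate (for example degree 0) scores poorly on the validation data because it cannot describe the signal, while an overfitted candidate tends to score worse than the true structure because it has partly fitted the noise in the estimation data. In a real MFA setting the validation data would come from a separate labelling experiment rather than from resampling the same curve.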