Escuela de Ingeniería Bioquímica, Pontificia Universidad Católica de Valparaíso, Valparaíso, Chile.
PLoS One. 2020 Dec 4;15(12):e0243067. doi: 10.1371/journal.pone.0243067. eCollection 2020.
Constraint-based models use steady-state mass balances to define a solution space of flux configurations, which can be narrowed down by measuring as many fluxes as possible. Due to loops and redundant pathways, this process typically yields multiple alternative solutions. To address this ambiguity, flux sampling can estimate the probability distribution of each flux, or a single flux configuration can be selected by further minimizing the sum of fluxes, on the assumption that cellular metabolism favors states in which enzyme-related costs are economized. However, flux sampling is susceptible to artifacts introduced by thermodynamically infeasible cycles, and it is not clear whether the economy of fluxes assumption (EFA) is universally valid. Here, we formulated a constraint-based approach, MaxEnt, based on the principle of maximum entropy, which in this context states that if more than one flux configuration is consistent with a set of experimentally measured fluxes, then the one with the fewest unwarranted assumptions is the best estimate of the non-observed fluxes. We compared MaxEnt predictions to publicly available flux data for Escherichia coli and Saccharomyces cerevisiae. We found that the mean square error (MSE) between experimental fluxes and those predicted by MaxEnt and EFA-based methods is three orders of magnitude lower than the median of 1,350,000 MSE values obtained using flux sampling. However, only MaxEnt and flux sampling correctly predicted flux through E. coli's glyoxylate cycle, whereas EFA-based methods, in general, predict no flux cycles. We also tested MaxEnt predictions at increasing levels of overflow metabolism. We found that MaxEnt accuracy is not affected by the level of overflow metabolism, whereas the performance of EFA-based methods decreases.
These results suggest that MaxEnt is less sensitive than flux sampling to artifacts introduced by thermodynamically infeasible cycles and that its predictions are less susceptible to overfitting than EFA-based methods.
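The ambiguity described above, and the way the EFA resolves it, can be illustrated with a minimal sketch. The four-reaction toy network, its flux bounds, and the helper name `optimize` are hypothetical and are not taken from the study; the sketch shows only the steady-state constraint and the minimization of the sum of fluxes, not the MaxEnt formulation itself, and it relies on SciPy's `linprog`.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not from the paper).
# Rows: metabolites A, B. Columns: reactions v1..v4.
# v1: -> A (uptake), v2: A -> B, v3: B -> A (closes a cycle with v2),
# v4: B -> (secretion)
S = np.array([
    [1.0, -1.0,  1.0,  0.0],  # mass balance of A
    [0.0,  1.0, -1.0, -1.0],  # mass balance of B
])

# v1 is "measured" and fixed at 1; the other fluxes are irreversible
# and capped at an assumed upper bound of 10.
bounds = [(1.0, 1.0), (0.0, 10.0), (0.0, 10.0), (0.0, 10.0)]
steady_state = np.zeros(S.shape[0])  # S @ v = 0


def optimize(cost):
    """Minimize cost @ v subject to S @ v = 0 and the flux bounds."""
    res = linprog(cost, A_eq=S, b_eq=steady_state, bounds=bounds)
    if not res.success:
        raise RuntimeError(res.message)
    return res.x


# Range of the cycle flux v3 that is consistent with the measured uptake.
v3_min = optimize([0.0, 0.0, 1.0, 0.0])[2]
v3_max = optimize([0.0, 0.0, -1.0, 0.0])[2]

# EFA-style selection: the configuration with the smallest sum of fluxes.
v_efa = optimize(np.ones(4))

print("feasible v3 range:", v3_min, v3_max)
print("minimum-sum fluxes:", v_efa)
```

In this toy case the measurement leaves the cycle flux v3 undetermined, while minimizing the sum of fluxes drives it to zero, which mirrors the observation that EFA-based methods generally predict no flux cycles.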