Systems Biology Research Group, Institute for Infectious Diseases and Infection Control (IIMK), Jena University Hospital, Kollegiengasse 10, 07743, Jena, Germany.
BMC Bioinformatics. 2022 Jun 10;23(1):226. doi: 10.1186/s12859-022-04742-7.
Elucidating cellular metabolism has led to many breakthroughs in biotechnology, synthetic biology, and health sciences. To date, 13C tracer experiments are the most prominent approach for quantifying metabolic fluxes, often with high accuracy and precision. However, the technique demands substantial experimental resources. Alternatively, flux balance analysis (FBA) has been employed to estimate metabolic fluxes without labeling experiments. It is less informative, but it is inexpensive, requires little experimental effort, and can provide flux estimates in conditions that are difficult to study experimentally. Methods that integrate relevant experimental data have emerged to improve FBA flux estimates. Transcription profiling data are often selected because they are easy to generate at the genome scale; they are typically embedded by discretizing the genes that code for the respective enzymes into differentially and non-differentially expressed genes.
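For orientation, FBA in its standard textbook form is a linear program over the flux vector. The notation below is conventional and is an assumption here, since the abstract itself does not define symbols: S is the stoichiometric matrix (metabolites by reactions), v the flux vector, c the objective weights (for example, a biomass reaction), and l and u the lower and upper flux bounds.

```latex
\[
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{s.t.}\quad & S\,v = 0, \\
& l \le v \le u
\end{aligned}
\]
```

Methods that integrate expression data generally act on this program by adjusting the bounds l and u, the objective c, or both.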
We established the novel method Linear Programming based Gene Expression Model (LPM-GEM). LPM-GEM linearly embeds gene expression into FBA constraints. We implemented three strategies to reduce thermodynamically infeasible loops, which is a necessary prerequisite for such omics-based model building. As a case study, we built a model of B. subtilis grown on eight different carbon sources. Validated against 13C tracer-based metabolic flux data from the same conditions, the model gave good flux predictions from the respective transcription profiles. It also predicted the specific carbon sources well. Good prediction performance was likewise observed on an independent dataset that was not used during training. Furthermore, LPM-GEM outperformed a well-established model-building method.
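To make the idea of linearly embedding expression into FBA constraints concrete, the following is a minimal sketch and not the LPM-GEM formulation, which the abstract does not spell out. It assumes one expression value per reaction and scales each reaction's flux bounds in proportion to that value. The function name `expression_to_bounds`, the `min_fraction` parameter, and the reaction identifiers are hypothetical.

```python
def expression_to_bounds(expression, base_bounds, min_fraction=0.1):
    """Scale flux bounds linearly with normalized expression (illustrative only).

    expression:   dict mapping reaction id -> non-negative expression value
    base_bounds:  dict mapping reaction id -> (lower, upper) flux bounds
    min_fraction: hypothetical floor so that weakly expressed reactions
                  are constrained but never fully blocked
    """
    if not 0.0 <= min_fraction <= 1.0:
        raise ValueError("min_fraction must lie in [0, 1]")
    max_expr = max(expression.values(), default=0.0)
    if max_expr <= 0.0:
        raise ValueError("at least one expression value must be positive")

    scaled = {}
    for rxn, (lower, upper) in base_bounds.items():
        if rxn not in expression:
            # No expression data: keep the original bounds unchanged.
            scaled[rxn] = (lower, upper)
            continue
        # Linear map from [0, max_expr] onto [min_fraction, 1].
        frac = min_fraction + (1.0 - min_fraction) * expression[rxn] / max_expr
        scaled[rxn] = (lower * frac, upper * frac)
    return scaled


# Hypothetical toy input: two reactions with expression data, one without.
toy_expression = {"R1": 10.0, "R2": 5.0}
toy_bounds = {"R1": (0.0, 100.0), "R2": (-100.0, 100.0), "R3": (0.0, 50.0)}
toy_scaled = expression_to_bounds(toy_expression, toy_bounds)
```

The scaled bounds would then be passed to an FBA solver in place of the original ones. Because the mapping is linear in the expression values, the resulting optimization problem remains a linear program.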
LPM-GEM integrates gene expression data efficiently. The method supports gene expression-based FBA models and can be applied as an alternative for estimating metabolic fluxes when tracer experiments are not feasible.