Digital Technologies Research Centre, National Research Council of Canada, Ottawa, ON, Canada.
Department of Biochemistry, Microbiology, and Immunology, University of Ottawa, Ottawa, ON, Canada.
Methods Mol Biol. 2023;2553:417-439. doi: 10.1007/978-1-0716-2617-7_18.
Computational cell metabolism models seek to provide metabolic explanations of cell behavior under different conditions or following genetic alterations, help optimize in vitro cell growth environments, or predict cellular behavior in vivo and in vitro. At the extremes, mechanistic models can include highly detailed descriptions of a small number of metabolic reactions or an approximate representation of an entire metabolic network. To date, all mechanistic models have required details of individual metabolic reactions, either kinetic parameters or metabolic fluxes, as well as information about extracellular and intracellular metabolite concentrations. Despite extensive efforts and the increasing availability of high-quality data, the required in vivo data are not available for the majority of known metabolic reactions; thus, mechanistic models are based primarily on ex vivo kinetic measurements and limited flux information. Machine learning approaches provide an alternative for deriving functional dependencies from existing data. The increasing availability of metabolomic and lipidomic data, with growing feature coverage and sample set size, is expected to provide the data needed to derive machine learning models of cell metabolic processes. Moreover, machine learning analysis of longitudinal data can lead to predictive models of cell behavior over time. Conversely, machine learning models trained on steady-state data can provide descriptive models for comparing metabolic states in different environments or disease conditions. Additionally, including metabolic network knowledge in these analyses can further help in the development of models from limited data.
This chapter explores the application of machine learning to the modeling of cell metabolism. We first provide a theoretical explanation of several machine learning and hybrid mechanistic-machine learning methods currently being explored to model metabolism. Next, we introduce several avenues for improving these models with machine learning. Finally, we provide protocols for specific examples of the use of machine learning in the development of predictive cell metabolism models from metabolomic data. We describe data preprocessing, approaches for training both descriptive and predictive machine learning models, and the use of these models in synthetic and systems biology. The detailed protocols provide a list of the software tools and libraries used for these applications, step-by-step modeling procedures, troubleshooting guidance, and an overview of the existing limitations of these approaches.