School of Medicine, University of Galway, Galway, Ireland.
Protein Sci. 2024 Oct;33(10):e5150. doi: 10.1002/pro.5150.
The integration of proteomics data with constraint-based reconstruction and analysis (COBRA) models plays a pivotal role in understanding the relationship between genotype and phenotype, bridging the gap between genome-level phenomena and functional adaptations. Integrating a generic genome-scale model with protein-level information yields a context-specific metabolic model, which improves the accuracy of model predictions. This review explores methodologies for incorporating proteomics data into genome-scale models. Available methods are grouped into four categories based on how they integrate proteomics data and on their depth of modeling. Within each category, methods are introduced in chronological order of publication, demonstrating the progress of the field. Furthermore, challenges and potential solutions are outlined, including the limited availability of appropriate in vitro data and experimental enzyme turnover rates, and the trade-off between model accuracy, computational tractability, and data scarcity. In conclusion, simpler approaches demand fewer kinetic and omics data, leading to a less complex mathematical problem and lower computational cost. In contrast, approaches that delve deeper into cellular mechanisms and aim to create detailed mathematical models require more extensive kinetic and omics data, resulting in a more complex and computationally demanding problem. In some cases, however, this increased cost can be justified by the potential for more precise predictions.
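As a minimal illustration of how protein abundances can turn a generic model into a context-specific one, the sketch below caps each reaction's flux by an enzyme capacity, kcat × [E], a commonly used form of enzyme constraint. It is not taken from any specific method covered in the review. The reaction names, turnover rates, abundances, and the helper functions `constrain_with_proteomics` and `max_linear_pathway_flux` are all hypothetical, and the "model" is an unbranched three-step pathway rather than a genome-scale network.

```python
from dataclasses import dataclass

SECONDS_PER_HOUR = 3600.0


@dataclass
class Reaction:
    name: str
    upper_bound: float  # flux upper bound in mmol/gDW/h


def constrain_with_proteomics(reactions, kcat_per_s, abundance_mmol_per_gdw):
    """Return copies of the reactions with upper bounds tightened to kcat * [E].

    Reactions lacking either a turnover rate or a measured abundance keep
    their generic bound, which mirrors the data-scarcity problem in practice.
    """
    constrained = []
    for rxn in reactions:
        ub = rxn.upper_bound
        if rxn.name in kcat_per_s and rxn.name in abundance_mmol_per_gdw:
            # Enzyme capacity: kcat (1/s -> 1/h) times abundance (mmol/gDW).
            capacity = (kcat_per_s[rxn.name] * SECONDS_PER_HOUR
                        * abundance_mmol_per_gdw[rxn.name])
            ub = min(ub, capacity)
        constrained.append(Reaction(rxn.name, ub))
    return constrained


def max_linear_pathway_flux(reactions):
    """Maximum steady-state flux through an unbranched pathway.

    At steady state every step carries the same flux, so the smallest
    upper bound limits the whole pathway.
    """
    return min(rxn.upper_bound for rxn in reactions)


# Hypothetical generic model: three steps, each with a loose default bound.
generic = [Reaction("R1", 10.0), Reaction("R2", 10.0), Reaction("R3", 10.0)]

# Hypothetical data: turnover rates (1/s) and abundances (mmol/gDW).
# R3 has no data and therefore stays unconstrained.
kcat = {"R1": 100.0, "R2": 2.0}
abundance = {"R1": 1e-5, "R2": 1e-4}

specific = constrain_with_proteomics(generic, kcat, abundance)
generic_flux = max_linear_pathway_flux(generic)
specific_flux = max_linear_pathway_flux(specific)
print(f"generic max flux: {generic_flux:.2f} mmol/gDW/h")
print(f"context-specific max flux: {specific_flux:.2f} mmol/gDW/h")
```

In a genome-scale setting the same bounds would enter a linear program over the full stoichiometric matrix instead of a simple minimum. Methods that model enzyme usage or protein synthesis explicitly add further variables and constraints, which is where the extra data demand and computational cost described above come from.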