Elizabeth Brunk, Nathan Mih, Jonathan Monk, Zhen Zhang, Edward J. O'Brien, Spencer E. Bliven, Ke Chen, Roger L. Chang, Philip E. Bourne, Bernhard O. Palsson
Department of Bioengineering, University of California, San Diego, La Jolla, CA, 92093, USA.
Joint BioEnergy Institute, Emeryville, CA, 94608, USA.
BMC Syst Biol. 2016 Mar 11;10:26. doi: 10.1186/s12918-016-0271-6.
The success of genome-scale models (GEMs) can be attributed to the high-quality, bottom-up reconstructions of metabolic, protein synthesis, and transcriptional regulatory networks on an organism-specific basis. Such reconstructions are biochemically, genetically, and genomically structured knowledge bases that can be converted into a mathematical format to enable a myriad of computational biological studies. In recent years, genome-scale reconstructions have been extended to include protein structural information, which has opened up new vistas in systems biology research and empowered applications in structural systems biology and systems pharmacology.
Here, we present the generation, application, and dissemination of genome-scale models with protein structures (GEM-PRO) for Escherichia coli and Thermotoga maritima. We show the utility of integrating molecular-scale analyses with systems biology approaches by discussing several comparative analyses of the temperature dependence of growth, the distribution of protein fold families, substrate specificity, and characteristic features of whole-cell proteomes. Finally, to aid in the grand challenge of turning big data into knowledge, we provide several explicit tutorials on how protein-related information can be linked to genome-scale models in a public GitHub repository (https://github.com/SBRG/GEMPro/tree/master/GEMPro_recon/).
Translating genome-scale, protein-related information to structured data in the format of a GEM provides a direct mapping of gene to gene product to protein structure to biochemical reaction to network states to phenotypic function. Integrating molecular-level details of individual proteins, such as their physical, chemical, and structural properties, further expands the description of biochemical network-level properties. It can ultimately inform how whole-cell phenotypes are modeled and predicted, and how comparative systems biology approaches are used to study differences between organisms. GEM-PRO offers insight into the physical embodiment of an organism's genotype, and its use in this comparative framework enables exploration of the adaptive strategies of these organisms, opening the door to many new lines of research. With the tools, tutorials, and background provided, the reader will be in a position to run GEM-PRO for their own purposes.
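The gene to gene product to protein structure to reaction mapping described above can be pictured as a small linked data structure. The following is a minimal sketch only, not the format used in the GEM-PRO repository: the class names, field names, and all identifiers (`gene_A`, `PROT_A`, `STRUCT_1`, `RXN_1`, and so on) are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProteinAnnotation:
    """Protein-level data attached to one gene (hypothetical fields)."""
    protein_id: str
    structure_ids: List[str] = field(default_factory=list)


@dataclass
class GemProSketch:
    """Toy container linking reactions -> genes -> proteins -> structures."""
    gene_to_protein: Dict[str, ProteinAnnotation] = field(default_factory=dict)
    reaction_to_genes: Dict[str, List[str]] = field(default_factory=dict)

    def structures_for_reaction(self, reaction_id: str) -> List[str]:
        """Collect structure IDs for every gene associated with a reaction."""
        structures: List[str] = []
        for gene_id in self.reaction_to_genes.get(reaction_id, []):
            annotation = self.gene_to_protein.get(gene_id)
            if annotation is not None:
                structures.extend(annotation.structure_ids)
        return structures

    def reactions_for_structure(self, structure_id: str) -> List[str]:
        """Walk the mapping in reverse: structure -> gene -> reaction."""
        genes = {
            gene_id
            for gene_id, annotation in self.gene_to_protein.items()
            if structure_id in annotation.structure_ids
        }
        return sorted(
            reaction_id
            for reaction_id, gene_ids in self.reaction_to_genes.items()
            if genes.intersection(gene_ids)
        )


# Placeholder data: two genes, three structures, two reactions.
model = GemProSketch(
    gene_to_protein={
        "gene_A": ProteinAnnotation("PROT_A", ["STRUCT_1", "STRUCT_2"]),
        "gene_B": ProteinAnnotation("PROT_B", ["STRUCT_3"]),
    },
    reaction_to_genes={
        "RXN_1": ["gene_A"],
        "RXN_2": ["gene_A", "gene_B"],
    },
)

structures_rxn2 = model.structures_for_reaction("RXN_2")
reactions_struct1 = model.reactions_for_structure("STRUCT_1")
```

Keeping the two dictionaries separate mirrors the layered mapping in the text: protein-level annotations can be extended (for example with physical or chemical properties) without touching the reaction network.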