Roy Somtirtha, Radivojevic Tijana, Forrer Mark, Marti Jose Manuel, Jonnalagadda Vamshi, Backman Tyler, Morrell William, Plahar Hector, Kim Joonhoon, Hillson Nathan, Garcia Martin Hector
Lawrence Berkeley National Laboratory, Biological Systems and Engineering Division, Berkeley, CA, United States.
Department of Energy, Agile BioFoundry, Emeryville, CA, United States.
Front Bioeng Biotechnol. 2021 Feb 9;9:612893. doi: 10.3389/fbioe.2021.612893. eCollection 2021.
Biology has changed radically in the past two decades, growing from a purely descriptive science into one that is also a design science. The availability of tools that enable the precise modification of cells, together with the ability to collect large amounts of multimodal data, opens the possibility of sophisticated bioengineering to produce fuels, specialty and commodity chemicals, materials, and other renewable bioproducts. However, despite new tools and exponentially increasing data volumes, synthetic biology cannot yet fulfill its true potential because we are unable to predict the behavior of biological systems. Here, we showcase a set of computational tools that, combined, provide the ability to store, visualize, and leverage multiomics data to predict the outcome of bioengineering efforts. We show how to upload multiomics data and strain information to online repositories, and how to visualize and export them, for several isoprenol-producing strain designs. We then use these data to train machine learning algorithms that recommend new strain designs, which are correctly predicted to improve isoprenol production by 23%. The demonstration uses synthetic data generated by a novel library that can produce credible multiomics data for testing algorithms and computational tools. In short, this paper provides a step-by-step tutorial on using these computational tools to improve production in bioengineered strains.