Biomedical Engineering, Boston University, Boston, Massachusetts 02215, United States.
Biological Design Center, Boston University, Boston, Massachusetts 02215, United States.
ACS Synth Biol. 2023 Aug 18;12(8):2367-2381. doi: 10.1021/acssynbio.3c00203. Epub 2023 Jul 19.
Engineering biology relies on the accurate prediction of cell responses. However, making these predictions is challenging for a variety of reasons, including the stochasticity of biochemical reactions, variability between cells, and incomplete information about underlying biological processes. Machine learning methods, which can model diverse input-output relationships without requiring mechanistic knowledge, are well suited to this task. For example, such approaches can be used to predict gene expression dynamics given time-series data of past expression history. To explore this application, we computationally simulated single-cell responses, incorporating different sources of noise and alternative genetic circuit designs. We showed that deep neural networks trained on these simulated data were able to correctly infer the underlying dynamics of a cell response even in the presence of measurement noise and stochasticity in the biochemical reactions. The training set size and the amount of past data provided as inputs both affected prediction quality, with cascaded genetic circuits that introduce delays requiring more past data. We also tested prediction performance on a bistable auto-activation circuit, finding that our initial method, which predicted a single trajectory, was fundamentally ill-suited for multimodal dynamics. To address this, we updated the network architecture to predict the entire distribution of future states and showed that it could accurately predict bimodal expression distributions. Overall, these methods can be readily applied to the diverse prediction tasks involved in forecasting and controlling a variety of biological circuits, a key aspect of many synthetic biology applications.
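The data-generation step described above (stochastic single-cell simulation, added measurement noise, and past-expression inputs paired with future-expression targets) can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not specify the circuit models or parameters, so the simple production/degradation model, the rate constants, the noise level, and the function names `simulate_birth_death`, `add_measurement_noise`, and `make_windows` are all assumptions made for illustration.

```python
import random


def simulate_birth_death(k_prod, k_deg, n_samples, dt, rng):
    """Gillespie simulation of constitutive production and first-order
    degradation of one protein (a hypothetical stand-in circuit).
    Returns the copy number sampled every dt time units."""
    t, x = 0.0, 0
    samples = []
    while len(samples) < n_samples:
        a_prod = k_prod            # propensity of production
        a_deg = k_deg * x          # propensity of degradation
        a_total = a_prod + a_deg
        t_next = t + rng.expovariate(a_total)  # time of the next reaction
        # The state is constant until t_next, so record every sampling
        # time that falls before the next reaction fires.
        while len(samples) < n_samples and len(samples) * dt < t_next:
            samples.append(x)
        t = t_next
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1
    return samples


def add_measurement_noise(series, sigma, rng):
    """Add zero-mean Gaussian noise to mimic measurement error (assumed model)."""
    return [value + rng.gauss(0.0, sigma) for value in series]


def make_windows(series, n_past, n_future):
    """Slice a time series into (past, future) pairs for supervised training."""
    pairs = []
    for start in range(len(series) - n_past - n_future + 1):
        past = series[start:start + n_past]
        future = series[start + n_past:start + n_past + n_future]
        pairs.append((past, future))
    return pairs


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical parameters, chosen only to give a readable trace.
    trace = simulate_birth_death(k_prod=10.0, k_deg=0.1, n_samples=200, dt=1.0, rng=rng)
    noisy = add_measurement_noise(trace, sigma=2.0, rng=rng)
    pairs = make_windows(noisy, n_past=20, n_future=5)
    print(len(pairs), "training pairs from one simulated cell")
```

In this sketch `n_past` plays the role of the "amount of past data provided as inputs". A circuit with delays, such as a cascade, would plausibly need a larger value, in line with the finding reported above. Each `(past, future)` pair would then serve as one input-target example for whichever network is trained.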