Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, Wisconsin, United States of America.
Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, United States of America.
PLoS Comput Biol. 2023 Sep 29;19(9):e1011436. doi: 10.1371/journal.pcbi.1011436. eCollection 2023 Sep.
Microbiomes interact dynamically with their environment to perform exploitable functions such as production of valuable metabolites and degradation of toxic metabolites for a wide range of applications in human health, agriculture, and environmental cleanup. Developing computational models that predict the key bacterial species and environmental factors needed to build and optimize such functions is crucial to accelerating microbial community engineering. However, an unknown web of interactions determines the highly complex and dynamic behavior of these systems, which precludes the development of models based on known mechanisms. By contrast, entirely data-driven machine learning models can produce physically unrealistic predictions and often require significant amounts of experimental data to learn system behavior. We develop a physically constrained recurrent neural network that preserves model flexibility but is constrained to produce physically consistent predictions, and we show that it can outperform existing machine learning methods in predicting certain experimentally measured species abundances and metabolite concentrations. Further, we present a closed-loop Bayesian experimental design algorithm that guides data collection by selecting experimental conditions that simultaneously maximize information gain and target microbial community functions. Using a bioreactor case study, we demonstrate how the proposed framework can be used to efficiently navigate a large design space to identify optimal operating conditions. The proposed methodology offers a flexible machine learning approach specifically tailored to optimizing microbiome target functions through the sequential design of informative experiments that seek to explore and exploit community functions.
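To illustrate the idea of selecting conditions that balance information gain against a target function, the following is a minimal sketch and not the authors' algorithm. It assumes that a Bayesian model has already produced posterior samples of the predicted target function for each candidate condition, and it swaps in a simple upper-confidence-bound style score (posterior mean plus a weighted posterior standard deviation) as a stand-in for an information-gain criterion. The function name `select_condition` and the parameter `explore_weight` are hypothetical.

```python
import statistics


def select_condition(posterior_predictions, explore_weight=1.0):
    """Return the index of the candidate condition with the highest score.

    posterior_predictions: one list of posterior samples per candidate,
        each sample being a predicted value of the target function
        (assumed to come from some Bayesian model fitted elsewhere).
    explore_weight: hypothetical trade-off knob; 0 means pure exploitation,
        larger values favor candidates with uncertain predictions.
    """
    best_index = None
    best_score = float("-inf")
    for index, samples in enumerate(posterior_predictions):
        mean = statistics.fmean(samples)      # exploitation term
        spread = statistics.pstdev(samples)   # uncertainty (exploration) term
        score = mean + explore_weight * spread
        if score > best_score:                # ties keep the earliest candidate
            best_index, best_score = index, score
    return best_index


# Two hypothetical candidate conditions: the first is predicted confidently,
# the second has a lower mean but a much wider posterior.
example_predictions = [
    [2.0, 2.0],
    [0.0, 3.0],
]
exploit_choice = select_condition(example_predictions, explore_weight=0.0)
explore_choice = select_condition(example_predictions, explore_weight=1.0)
```

In a closed loop, the chosen condition would be run experimentally, the model refit on the enlarged data set, and the selection repeated; how the abstract's method quantifies information gain is not specified here, so the standard-deviation term is only a placeholder.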