IEEE/ACM Trans Comput Biol Bioinform. 2020 Jan-Feb;17(1):302-315. doi: 10.1109/TCBB.2018.2843339. Epub 2018 Jun 7.
Modeling and simulation techniques have demonstrated success in studying biological systems. As the drive to better capture biological complexity leads to more sophisticated simulators, it becomes challenging to perform statistical analyses that help translate predictions into increased understanding. These analyses may require repeated executions and extensive sampling of high-dimensional parameter spaces: analyses that may become intractable due to time and resource limitations. Significant reduction in these requirements can be obtained using surrogate models, or emulators, that can rapidly and accurately predict the output of an existing simulator. We apply emulation to evaluate and enrich understanding of a previously published agent-based simulator of lymphoid tissue organogenesis, showing an ensemble of machine learning techniques can reproduce results obtained using a suite of statistical analyses within seconds. This performance improvement permits incorporation of previously intractable analyses, including multi-objective optimization to obtain parameter sets that yield a desired response, and Approximate Bayesian Computation to assess parametric uncertainty. To facilitate exploitation of emulation in simulation-focused studies, we extend our open source statistical package, spartan, to provide a suite of tools for emulator development, validation, and application. Overcoming resource limitations permits enriched evaluation and refinement, easing translation of simulator insights into increased biological understanding.
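The emulation idea summarized above can be illustrated with a minimal sketch: run the simulator on sampled parameter sets, fit a fast surrogate to those runs, and check it against held-out runs. Everything here is an assumption made for illustration. `toy_simulator` is a hypothetical stand-in for an expensive simulator, and a k-nearest-neighbour regressor is swapped in for the ensemble of machine learning techniques the abstract mentions. This does not use or reproduce the spartan API.

```python
import math
import random


def toy_simulator(params):
    """Hypothetical stand-in for an expensive simulator (2 inputs, 1 output)."""
    a, b = params
    return math.sin(3.0 * a) + b * b


def sample_parameters(n, rng):
    """Draw n parameter sets uniformly from the unit square [0, 1]^2."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def build_knn_emulator(train_x, train_y, k=5):
    """Return a predictor that averages the k nearest training runs,
    weighting each run by the inverse of its distance to the query."""
    def emulate(params):
        nearest = sorted(
            (math.dist(params, x), y) for x, y in zip(train_x, train_y)
        )[:k]
        if nearest[0][0] == 0.0:
            # Query coincides with a training point: return its output directly.
            return nearest[0][1]
        weights = [1.0 / d for d, _ in nearest]
        return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)
    return emulate


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Training set: the only place the (assumed expensive) simulator must be run.
train_x = sample_parameters(2000, rng)
train_y = [toy_simulator(x) for x in train_x]
emulate = build_knn_emulator(train_x, train_y)

# Validation: compare emulator predictions with held-out simulator runs.
test_x = sample_parameters(200, rng)
errors = [abs(emulate(x) - toy_simulator(x)) for x in test_x]
mean_abs_error = sum(errors) / len(errors)
print(f"mean absolute error on held-out runs: {mean_abs_error:.4f}")
```

Once validated, a surrogate like `emulate` could stand in for the simulator inside sampling-heavy analyses such as sensitivity analysis or Approximate Bayesian Computation, where the cost per evaluation dominates.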