Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge CB3 0WA, UK.
Department of Mathematics, Imperial College London, London SW7 2AZ, UK.
Proc Natl Acad Sci U S A. 2023 Feb 14;120(7):e2216415120. doi: 10.1073/pnas.2216415120. Epub 2023 Feb 10.
Computational models have become a powerful tool in the quantitative sciences to understand the behavior of complex systems that evolve in time. However, they often contain a potentially large number of free parameters whose values cannot be obtained from theory but need to be inferred from data. This is especially the case for models in the social sciences, economics, or computational epidemiology. Yet, many current parameter estimation methods are mathematically involved and computationally slow to run. In this paper, we present a computationally simple and fast method to retrieve accurate probability densities for model parameters using neural differential equations. We present a pipeline comprising multiagent models acting as forward solvers for systems of ordinary or stochastic differential equations and a neural network to then extract parameters from the data generated by the model. The two combined create a powerful tool that can quickly estimate densities on model parameters, even for very large systems. We demonstrate the method on synthetic time series data of the SIR model of the spread of infection and perform an in-depth analysis of the Harris-Wilson model of economic activity on a network, representing a nonconvex problem. For the latter, we apply our method both to synthetic data and to data of economic activity across Greater London. We find that our method calibrates the model orders of magnitude more accurately than a previous study of the same dataset using classical techniques, while running between 195 and 390 times faster.
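As a rough illustration of the forward-solver half of the pipeline described in the abstract, the following minimal sketch integrates the deterministic SIR equations with a forward-Euler scheme and scores a candidate parameter pair against synthetic data using a squared-error loss. It is not the authors' implementation: the neural network that proposes parameters is omitted, and the function names (`simulate_sir`, `squared_error`), the step size, the initial conditions, and the parameter values are all assumptions chosen for illustration.

```python
def simulate_sir(beta, gamma, s0=0.99, i0=0.01, r0=0.0, dt=0.1, steps=500):
    """Forward-Euler integration of the SIR equations (illustrative only):
        dS/dt = -beta * S * I
        dI/dt =  beta * S * I - gamma * I
        dR/dt =  gamma * I
    Returns a list of (S, I, R) tuples, including the initial state.
    """
    s, i, r = s0, i0, r0
    trajectory = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt
        recoveries = gamma * i * dt
        s, i, r = s - new_infections, i + new_infections - recoveries, r + recoveries
        trajectory.append((s, i, r))
    return trajectory


def squared_error(traj_a, traj_b):
    """Sum of squared differences between two trajectories of equal length."""
    return sum(
        (a - b) ** 2
        for point_a, point_b in zip(traj_a, traj_b)
        for a, b in zip(point_a, point_b)
    )


# Synthetic "observed" data generated with assumed ground-truth parameters.
observed = simulate_sir(beta=0.4, gamma=0.1)

# Loss at the true parameters versus a deliberately wrong guess for beta.
loss_true = squared_error(simulate_sir(beta=0.4, gamma=0.1), observed)
loss_off = squared_error(simulate_sir(beta=0.6, gamma=0.1), observed)
```

In the setting the abstract describes, a neural network would propose the values of `beta` and `gamma`, and a loss of this kind would be used to train it; here the two candidates are simply hard-coded to show that the loss distinguishes a correct parameter pair from an incorrect one.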