School of Mathematics and Statistics, University of Glasgow, Glasgow G12 8SQ, UK.
Grupo de Ecología Cuantitativa, INIBIOMA-CONICET, Universidad Nacional del Comahue, Bariloche, Argentina.
J R Soc Interface. 2023 Jan;20(198):20220676. doi: 10.1098/rsif.2022.0676. Epub 2023 Jan 4.
Inferring the underlying processes that drive collective behaviour in biological and social systems is a significant statistical and computational challenge. While simulation models have successfully captured, at a qualitative level, many of the phenomena observed in these systems across a variety of domains, formally fitting these models to data remains intractable. Recently, approximate Bayesian computation (ABC) has been shown to be an effective approach to inference when the likelihood function for a model is unavailable. However, a key difficulty in successfully implementing ABC lies in the design, selection and weighting of appropriate summary statistics, a challenge that is especially acute when modelling high-dimensional complex systems. In this work, we combine a Gaussian process-accelerated ABC method with the automatic learning of summary statistics via graph neural networks. Our approach bypasses the need to design a model-specific set of summary statistics for inference. Instead, we encode relational inductive biases into a neural network using a graph embedding and then extract summary statistics automatically from simulation data. To evaluate our framework, we use a model of collective animal movement as a test bed and compare our method with a standard summary statistics approach and a linear regression-based algorithm.
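For readers unfamiliar with ABC, the role played by summary statistics can be illustrated with a minimal rejection-sampling sketch. This is not the Gaussian process-accelerated method or the graph neural network approach described in the abstract. It is the plain rejection variant of ABC applied to a toy problem, and the simulator (draws from a normal distribution with unknown mean), the uniform prior, the tolerance `eps` and all function names are assumptions made purely for illustration.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy simulator (an assumption, not the paper's movement model):
    n independent draws from Normal(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def summary(data):
    """Hand-designed summary statistic: the sample mean."""
    return statistics.fmean(data)


def abc_rejection(observed, n_draws, eps, rng, prior=(-5.0, 5.0)):
    """Plain ABC rejection: keep prior draws whose simulated summary
    lies within eps of the observed summary."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(*prior)  # candidate drawn from the uniform prior
        s_sim = summary(simulate(theta, len(observed), rng))
        if abs(s_sim - s_obs) < eps:  # compare summaries, not raw data
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = simulate(2.0, 100, rng)  # synthetic "data" generated with theta = 2
posterior = abc_rejection(observed, 5000, 0.1, rng)
posterior_mean = statistics.fmean(posterior) if posterior else float("nan")
print(len(posterior), posterior_mean)
```

The likelihood is never evaluated: the accepted values of `theta` approximate the posterior only through the comparison of summaries. The quality of the approximation therefore depends on how informative `summary` is, which is the design burden that learned summary statistics aim to remove.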