Raynal Louis, Chen Sixing, Mira Antonietta, Onnela Jukka-Pekka
Department of Biostatistics, T.H. Chan School of Public Health, Harvard University, 655 Huntington Avenue, Building 2, 4th Floor, Boston, MA, USA 02115.
Data Science Lab, Institute of Computational Science, Università della Svizzera italiana, Via Buffi 6, 6900 Lugano, Switzerland.
Bayesian Anal. 2022 Mar;17(1):165-192. doi: 10.1214/20-ba1248. Epub 2020 Dec 8.
Approximate Bayesian computation (ABC) is a simulation-based, likelihood-free method applicable to both model selection and parameter estimation. ABC parameter estimation requires the ability to forward simulate datasets from a candidate model, but because the sizes of the observed and simulated datasets usually need to match, this can be computationally expensive. Additionally, since ABC inference is based on comparisons of summary statistics computed on the observed and simulated data, using computationally expensive summary statistics can lead to further losses in efficiency. ABC has recently been applied to the family of mechanistic network models, an area that has traditionally lacked tools for inference and model choice. Mechanistic models of network growth repeatedly add nodes to a network until it reaches the size of the observed network, which may be on the order of millions of nodes. With ABC, this process can quickly become computationally prohibitive because both simulating networks and evaluating summary statistics are resource-intensive. We propose two methodological developments to enable the use of ABC for inference in models for large growing networks. First, to save the time needed for forward simulating model realizations, we propose a procedure to extrapolate (via both least squares and Gaussian processes) summary statistics from small to large networks. Second, to reduce computation time for evaluating summary statistics, we use sample-based rather than census-based summary statistics. We show that the ABC posterior obtained through this approach, which adds two layers of approximation to standard ABC, is similar to a classic ABC posterior. Although we deal with growing network models, both extrapolated summaries and sampled summaries are expected to be relevant in other ABC settings where the data are generated incrementally.
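The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the toy growth model, the leaf-fraction summary, the regression on 1/n, the uniform prior, and every function name below are assumptions made for illustration only. Only the least-squares variant of the extrapolation is shown; the Gaussian-process variant is omitted.

```python
import random


def grow_network(n_nodes, alpha, rng):
    """Toy growth model (an assumption): each new node links to one existing
    node, chosen by degree with probability alpha, else uniformly at random.
    Returns the list of node degrees."""
    degrees = [1, 1]      # start from the single edge 0-1
    endpoints = [0, 1]    # every node appears once per incident edge
    for new in range(2, n_nodes):
        if rng.random() < alpha:
            target = rng.choice(endpoints)   # degree-proportional choice
        else:
            target = rng.randrange(new)      # uniform choice
        degrees[target] += 1
        degrees.append(1)
        endpoints.extend((target, new))
    return degrees


def sampled_leaf_fraction(degrees, sample_size, rng):
    """Sample-based summary: share of degree-1 nodes in a random node sample
    (falls back to a census when the network is smaller than the sample)."""
    k = min(sample_size, len(degrees))
    sample = rng.sample(range(len(degrees)), k)
    return sum(1 for v in sample if degrees[v] == 1) / k


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def extrapolated_summary(alpha, small_sizes, target_size, sample_size, rng):
    """Simulate only small networks, regress the summary on 1/n (an assumed
    functional form), and predict its value at the target size."""
    xs = [1.0 / n for n in small_sizes]
    ys = [sampled_leaf_fraction(grow_network(n, alpha, rng), sample_size, rng)
          for n in small_sizes]
    a, b = fit_line(xs, ys)
    return a + b / target_size


def abc_rejection(s_obs, n_draws, eps, small_sizes, target_size, sample_size, rng):
    """Rejection ABC with a Uniform(0, 1) prior on alpha (an assumption):
    keep draws whose extrapolated summary lies within eps of s_obs."""
    accepted = []
    for _ in range(n_draws):
        alpha = rng.random()
        s_sim = extrapolated_summary(alpha, small_sizes, target_size,
                                     sample_size, rng)
        if abs(s_sim - s_obs) < eps:
            accepted.append(alpha)
    return accepted


if __name__ == "__main__":
    rng = random.Random(42)
    # "Observed" network: simulated here at full size with a known alpha.
    observed = grow_network(20000, 0.7, rng)
    s_obs = sampled_leaf_fraction(observed, 1000, rng)
    post = abc_rejection(s_obs, 200, 0.02, [200, 400, 800, 1600], 20000, 500, rng)
    print(len(post), "accepted draws")
```

In this sketch the full-size network is simulated once, to stand in for the observed data, while every ABC draw grows only networks of at most 1,600 nodes. The accepted draws approximate the posterior of `alpha` under the two extra layers of approximation the abstract describes, and the tolerance `eps` and the small sizes would need tuning in any real application.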