Lee Xing Ju, Drovandi Christopher C, Pettitt Anthony N
School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, 4000, Australia.
Biometrics. 2015 Mar;71(1):198-207. doi: 10.1111/biom.12249. Epub 2014 Oct 9.
Analytically or computationally intractable likelihood functions can arise in complex statistical inferential problems, making them inaccessible to standard Bayesian inferential methods. Approximate Bayesian computation (ABC) methods address such problems by replacing direct likelihood evaluations with repeated sampling from the model. ABC methods have been applied predominantly to parameter estimation problems and less often to model choice problems, owing to the added difficulty of handling multiple model spaces. The ABC algorithm proposed here addresses model choice problems by extending the approach of Fearnhead and Prangle (2012, Journal of the Royal Statistical Society, Series B 74, 1-28), in which the posterior means of the model parameters, estimated through regression, form the summary statistics used in the discrepancy measure. In the regression step, an additional stepwise multinomial logistic regression is performed on the model indicator variable, and the estimated model probabilities are added to the set of summary statistics for model choice. A reversible jump Markov chain Monte Carlo step is also included in the algorithm to increase model diversity, so that the model space is explored thoroughly. The algorithm was applied to a validating example to demonstrate its robustness across a wide range of true model probabilities. Its subsequent use in three pathogen transmission examples of varying complexity illustrates its utility in inferring which transmission models are preferred for the pathogens.
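The core ABC idea described in the abstract, replacing likelihood evaluation with repeated simulation from the model, can be illustrated with a minimal rejection sampler for model choice. This sketch is not the authors' algorithm: it omits the regression-based summary statistics, the stepwise multinomial logistic regression, and the reversible jump MCMC step. The two toy Gaussian models, the uniform priors, the summary statistics (sample mean and standard deviation), the tolerance `eps`, and all function names are assumptions made only for illustration.

```python
import math
import random
import statistics


def simulate(model, mu, n, rng):
    """Draw n observations from a toy model (hypothetical models).

    Model 0: Normal(mu, sd=1). Model 1: Normal(mu, sd=2).
    """
    sd = 1.0 if model == 0 else 2.0
    return [rng.gauss(mu, sd) for _ in range(n)]


def summaries(data):
    """Hand-picked summary statistics: sample mean and sample sd."""
    return (statistics.fmean(data), statistics.stdev(data))


def abc_model_choice(observed, n_sims, eps, rng):
    """Plain ABC rejection sampling over the joint (model, parameter) space.

    Returns the estimated posterior model probabilities, i.e. the share of
    accepted simulations that came from each model.
    """
    s_obs = summaries(observed)
    accepted = [0, 0]
    for _ in range(n_sims):
        model = rng.randrange(2)      # uniform prior over the two models
        mu = rng.uniform(-5.0, 5.0)   # assumed uniform prior on mu
        s_sim = summaries(simulate(model, mu, len(observed), rng))
        # Discrepancy measure: Euclidean distance between summary vectors.
        if math.dist(s_obs, s_sim) < eps:
            accepted[model] += 1
    total = sum(accepted)
    if total == 0:
        raise ValueError("no simulations accepted; increase eps or n_sims")
    return [count / total for count in accepted]


rng = random.Random(1)
observed = simulate(0, 1.0, 50, rng)  # synthetic data generated from model 0
probs = abc_model_choice(observed, n_sims=10000, eps=0.3, rng=rng)
print("estimated P(model 0), P(model 1):", probs)
```

Because the synthetic data come from model 0, most accepted simulations should come from that model. In the approach the abstract describes, the hand-picked summaries used here would instead be replaced by regression estimates of the posterior parameter means and of the model probabilities.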