Bliznyuk Nikolay, Ruppert David, Shoemaker Christine A
Department of Statistics, Texas A&M University, College Station, TX 77843
School of Operations Research and Information Engineering, Cornell University, Comstock Hall, Ithaca, NY 14853
J Comput Graph Stat. 2011;20(3):636-655. doi: 10.1198/jcgs.2011.09212. Epub 2012 Jan 24.
Markov chain Monte Carlo (MCMC) is now a standard approach to the numerical computation of integrals of the posterior density of a parameter vector. Unfortunately, Bayesian inference using MCMC becomes computationally intractable when the posterior density is expensive to evaluate. In many such problems, it is possible to identify a minimal subvector of the parameter vector that is responsible for the expensive computation in the evaluation of the posterior density. We propose two approaches, DOSKA and INDA, that approximate the posterior density by interpolation in ways that exploit this computational structure to mitigate the curse of dimensionality. DOSKA interpolates the posterior density directly, while INDA interpolates it indirectly by interpolating functions upon which the posterior density depends, for example, a regression function. Our primary contribution is the derivation of a Gaussian process interpolant that provably improves over some existing approaches by reducing the effective dimension of the interpolation problem from the dimension of the full parameter vector to the dimension of the subvector. This allows a dramatic reduction in the number of expensive evaluations needed to construct an accurate approximation of the posterior density when the parameter vector is high-dimensional but the subvector is low-dimensional. We illustrate the proposed approaches in a case study of a spatio-temporal linear model for air pollution data in the greater Boston area. Supplemental materials include proofs, details, and a software implementation of the proposed procedures.
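The dimension-reduction idea can be illustrated with a toy sketch. This is not the DOSKA or INDA procedure from the paper. It assumes, purely for illustration, that the log-posterior splits additively into an expensive term that depends only on a one-dimensional subvector `beta` and a cheap term that depends on the full parameter vector `(beta, rest)`. Every name in the sketch (`expensive_part`, `cheap_part`, `fit_interpolant`, the Gaussian kernel and its length scale) is hypothetical. The expensive term is interpolated with a zero-mean Gaussian-kernel interpolant built over `beta` alone, so the number of costly evaluations is set by the number of knots in `beta` and not by the dimension of the full vector.

```python
import math

# Counts evaluations of the costly component (hypothetical bookkeeping).
CALLS = {"expensive": 0}

def expensive_part(beta):
    """Stand-in for the costly piece of the log-posterior; depends on beta only."""
    CALLS["expensive"] += 1
    return -0.5 * (math.sin(3.0 * beta) + beta) ** 2

def cheap_part(beta, rest):
    """Cheap piece of the log-posterior; depends on the full vector (beta, rest)."""
    return -0.5 * sum((r - beta) ** 2 for r in rest)

def log_post_exact(beta, rest):
    """Exact toy log-posterior: one expensive evaluation per call."""
    return expensive_part(beta) + cheap_part(beta, rest)

def kernel(a, b, length=0.3):
    """Gaussian (squared-exponential) covariance between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def fit_interpolant(knots, nugget=1e-10):
    """Build a kernel interpolant of expensive_part over beta only.

    Costs exactly len(knots) expensive evaluations, regardless of len(rest).
    """
    values = [expensive_part(b) for b in knots]
    K = [[kernel(a, b) + (nugget if i == j else 0.0)
          for j, b in enumerate(knots)]
         for i, a in enumerate(knots)]
    weights = solve(K, values)

    def surrogate(beta):
        # Zero-mean GP posterior mean: weighted sum of kernels centred at knots.
        return sum(w * kernel(beta, k) for w, k in zip(weights, knots))

    return surrogate

def log_post_surrogate(beta, rest, surrogate):
    """Approximate log-posterior: interpolated expensive term plus exact cheap term."""
    return surrogate(beta) + cheap_part(beta, rest)
```

A usage sketch under the same assumptions: `surrogate = fit_interpolant([-2.0 + 4.0 * i / 14 for i in range(15)])` spends 15 expensive evaluations, after which `log_post_surrogate(beta, rest, surrogate)` can be called inside an MCMC loop for any length of `rest` without further expensive calls. Accuracy between knots depends on the knot placement and the kernel length scale, both of which are arbitrary choices here.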