Datta Abhirup, Banerjee Sudipto, Finley Andrew O, Hamm Nicholas A S, Schaap Martijn
Johns Hopkins University.
University of California, Los Angeles.
Ann Appl Stat. 2016 Sep;10(3):1286-1316. doi: 10.1214/16-AOAS931. Epub 2016 Sep 28.
Particulate matter (PM) is a class of environmental pollutants known to be detrimental to human health. Regulatory efforts aimed at curbing PM levels in different countries often require high-resolution space-time maps that can identify red-flag regions exceeding statutory concentration limits. Continuous spatio-temporal Gaussian Process (GP) models can deliver maps depicting predicted PM levels and quantify predictive uncertainty. However, GP-based approaches are usually thwarted by computational challenges posed by large datasets. We construct a novel class of scalable Dynamic Nearest Neighbor Gaussian Process (DNNGP) models that can provide a sparse approximation to any spatio-temporal GP (e.g., with nonseparable covariance structures). The DNNGP we develop here can be used as a sparsity-inducing prior for spatio-temporal random effects in any Bayesian hierarchical model to deliver full posterior inference. Storage and memory requirements for a DNNGP model are linear in the size of the dataset, thereby delivering massive scalability without sacrificing inferential richness. Extensive numerical studies reveal that the DNNGP approximates the underlying process substantially better than low-rank approximations do. Finally, we use the DNNGP to analyze a massive air quality dataset and substantially improve predictions of PM levels across Europe in conjunction with the LOTOS-EUROS chemistry transport models (CTMs).
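To make the sparsity idea in the abstract concrete, the following is a minimal sketch of a generic nearest-neighbor GP density, not the authors' DNNGP implementation. It assumes a separable exponential space-time covariance purely for brevity (the abstract stresses that nonseparable covariances are supported), points sorted by time, and neighbors chosen as the m earlier points with the highest covariance. The names `cov`, `neighbor_sets`, `nn_logdensity` and the parameters `phi_s`, `phi_t` are hypothetical.

```python
import numpy as np

def cov(a, b, sigma2=1.0, phi_s=3.0, phi_t=1.0):
    """Illustrative covariance (an assumption): separable exponential.
    Rows of a and b are (x, y, t)."""
    ds = np.linalg.norm(a[:, None, :2] - b[None, :, :2], axis=-1)
    dt = np.abs(a[:, None, 2] - b[None, :, 2])
    return sigma2 * np.exp(-phi_s * ds - phi_t * dt)

def neighbor_sets(coords, m):
    """For point i, pick up to m earlier points with the highest covariance.
    Brute-force search for clarity; only n*m indices are stored."""
    nbrs = [np.array([], dtype=int)]
    for i in range(1, len(coords)):
        c = cov(coords[i:i + 1], coords[:i])[0]
        nbrs.append(np.argsort(-c)[:min(m, i)])
    return nbrs

def nn_logdensity(w, coords, nbrs):
    """Sum of univariate conditional normal log-densities, each point
    conditioned only on its neighbor set instead of all earlier points."""
    total = 0.0
    for i, N in enumerate(nbrs):
        var = cov(coords[i:i + 1], coords[i:i + 1])[0, 0]
        mean = 0.0
        if len(N) > 0:
            c_iN = cov(coords[i:i + 1], coords[N])[0]
            b = np.linalg.solve(cov(coords[N], coords[N]), c_iN)
            mean = b @ w[N]          # kriging mean from neighbors
            var = var - b @ c_iN     # conditional variance
        total += -0.5 * (np.log(2 * np.pi * var) + (w[i] - mean) ** 2 / var)
    return total
```

Each conditional involves at most an m-by-m solve, so stored quantities grow linearly in the number of points for fixed m. When m is at least n - 1, every point conditions on all earlier points and the sum reduces to the full GP log-density by the chain rule.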