Highly Scalable Bayesian Geostatistical Modeling via Meshed Gaussian Processes on Partitioned Domains.

Authors

Peruzzi Michele, Banerjee Sudipto, Finley Andrew O

Affiliations

Department of Forestry, Michigan State University.

Department of Statistical Science, Duke University.

Publication information

J Am Stat Assoc. 2022;117(538):969-982. doi: 10.1080/01621459.2020.1833889. Epub 2020 Nov 24.

Abstract

We introduce a class of scalable Bayesian hierarchical models for the analysis of massive geostatistical datasets. The approach combines ideas from high-dimensional geostatistics by partitioning the spatial domain and modeling the regions in the partition using a sparsity-inducing directed acyclic graph (DAG). We extend the model over the DAG to a well-defined spatial process, which we call the Meshed Gaussian Process (MGP). A major contribution is the development of MGPs on tessellated domains, accompanied by a Gibbs sampler for the efficient recovery of spatial random effects. In particular, the cubic MGP (Q-MGP) can harness high-performance computing resources by executing all large-scale operations in parallel within the Gibbs sampler, improving mixing and computing time compared to sequential updating schemes. Unlike some existing models for large spatial data, a Q-MGP facilitates massive caching of expensive matrix operations, making it particularly well suited to spatiotemporal remote-sensing data. We compare Q-MGPs against state-of-the-art methods on large synthetic and real-world data. We also illustrate the approach using Normalized Difference Vegetation Index (NDVI) data from the Serengeti park region, recovering latent multivariate spatiotemporal random effects at millions of locations. The source code is available at github.com/mkln/meshgp.
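The partition-plus-DAG idea can be illustrated with a small sketch. The abstract does not specify the graph, so the code below rests on assumptions: a 2-D unit square cut into a k-by-k axis-parallel grid, each block taking as parents the preceding block along each axis, and a four-color labeling by coordinate parity. All function names are hypothetical and none of this is taken from the meshgp package. The check confirms that, under these assumptions, no block shares a color with any member of its Markov blanket (parents, children, and co-parents), which is the condition that would let same-colored blocks be updated in parallel within a Gibbs sweep.

```python
import numpy as np

def assign_blocks(coords, k):
    """Map 2-D points in [0, 1)^2 to block ids on a k-by-k axis-parallel grid.

    Assumption: a regular grid stands in for the tessellated domain.
    """
    idx = np.minimum((np.asarray(coords) * k).astype(int), k - 1)
    return idx[:, 0] * k + idx[:, 1]

def parents(i, j):
    """Assumed DAG: parents are the preceding blocks along each axis."""
    return [(a, b) for a, b in ((i - 1, j), (i, j - 1)) if a >= 0 and b >= 0]

def children(i, j, k):
    """Blocks that list (i, j) among their parents."""
    return [(a, b) for a, b in ((i + 1, j), (i, j + 1)) if a < k and b < k]

def markov_blanket(i, j, k):
    """Parents, children, and the children's other parents (co-parents)."""
    blanket = set(parents(i, j)) | set(children(i, j, k))
    for child in children(i, j, k):
        blanket |= set(parents(*child))
    blanket.discard((i, j))
    return blanket

def color(i, j):
    """Assumed coloring: one of 4 labels from the parity of each coordinate."""
    return 2 * (i % 2) + (j % 2)

def coloring_is_valid(k):
    """True if no block shares a color with a block in its Markov blanket."""
    return all(
        color(i, j) != color(a, b)
        for i in range(k)
        for j in range(k)
        for a, b in markov_blanket(i, j, k)
    )

if __name__ == "__main__":
    pts = np.array([[0.1, 0.1], [0.9, 0.9]])
    print(assign_blocks(pts, 2))
    print(coloring_is_valid(6))
```

A plain checkerboard (two colors) would not pass this check under the assumed DAG, because the co-parents of a block sit diagonally from it; labeling by per-axis parity avoids that.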
