On nearest-neighbor Gaussian process models for massive spatial data.

Author information

Datta Abhirup, Banerjee Sudipto, Finley Andrew O, Gelfand Alan E

Affiliations

Department of Biostatistics, University of Minnesota, Minneapolis, MN, USA.

Department of Biostatistics, University of California, Los Angeles, CA, USA.

Publication information

Wiley Interdiscip Rev Comput Stat. 2016 Sep-Oct;8(5):162-171. doi: 10.1002/wics.1383. Epub 2016 Aug 4.

Abstract

Gaussian process (GP) models provide a very flexible nonparametric approach to modeling datasets indexed by location and time. However, the storage and computational requirements of GP models are prohibitive for large spatial datasets. Nearest-neighbor Gaussian processes (NNGPs; Datta A, Banerjee S, Finley AO, Gelfand AE. Hierarchical nearest-neighbor Gaussian process models for large geostatistical datasets. JASA, 2016) provide a scalable alternative by using local information from a few nearest neighbors. Scalability is achieved by using the neighbor sets in a conditional specification of the model. We show how this is equivalent to sparse modeling of the Cholesky factors of large covariance matrices. We also discuss a general approach to constructing scalable Gaussian processes using sparse local kriging. We present a multivariate data analysis that demonstrates how the nearest-neighbor approach yields inference indistinguishable from that of the full-rank GP despite being several times faster. Finally, we propose a variant of the NNGP model that automates the selection of the neighbor set size.
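
The equivalence between the conditional neighbor-set specification and a sparse Cholesky-type factor can be illustrated with a small sketch. The code below is not the authors' implementation: the function names (`exp_cov`, `nngp_factors`, `nngp_precision`), the exponential covariance, its parameter values, and the use of the given row order as the ordering of locations are all assumptions made for illustration, and matrices are stored densely only to keep the example short. Each location is regressed on at most m nearest neighbors among the locations that precede it, which gives a strictly lower triangular matrix A with at most m nonzeros per row and a vector d of conditional variances; the implied precision matrix is (I - A)^T D^{-1} (I - A) with D = diag(d).

```python
import numpy as np

def exp_cov(a, b, sigma2=1.0, phi=3.0):
    """Exponential covariance between two point sets (assumed kernel, rows are coordinates)."""
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * dist)

def nngp_factors(coords, m, cov=exp_cov):
    """Hypothetical helper: build A (strictly lower triangular, <= m nonzeros
    per row) and conditional variances d from nearest-neighbor conditioning."""
    n = coords.shape[0]
    A = np.zeros((n, n))  # dense storage for clarity; a real implementation would be sparse
    d = np.empty(n)
    d[0] = cov(coords[:1], coords[:1])[0, 0]  # first location has no neighbors
    for i in range(1, n):
        # Neighbor set: up to m nearest locations among those ordered before i.
        dist = np.linalg.norm(coords[:i] - coords[i], axis=1)
        nbrs = np.argsort(dist)[:m]
        C_nn = cov(coords[nbrs], coords[nbrs])
        c_in = cov(coords[nbrs], coords[i:i + 1])[:, 0]
        # Local kriging weights and conditional variance given the neighbors.
        b = np.linalg.solve(C_nn, c_in)
        A[i, nbrs] = b
        d[i] = cov(coords[i:i + 1], coords[i:i + 1])[0, 0] - c_in @ b
    return A, d

def nngp_precision(A, d):
    """Implied precision matrix (I - A)^T diag(1/d) (I - A)."""
    I_minus_A = np.eye(len(d)) - A
    return I_minus_A.T @ (I_minus_A / d[:, None])
```

When m is at least n - 1, every location conditions on all of its predecessors and the construction reproduces the full GP precision exactly; a smaller m keeps A sparse, which is where the storage and computational savings described in the abstract come from.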

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b334/5894878/91b9ab00142e/nihms930114f1.jpg

Similar articles

High-Dimensional Bayesian Geostatistics.
Bayesian Anal. 2017 Jun;12(2):583-614. doi: 10.1214/17-BA1056R. Epub 2017 May 16.
Efficient algorithms for Bayesian Nearest Neighbor Gaussian Processes.
J Comput Graph Stat. 2019;28(2):401-414. doi: 10.1080/10618600.2018.1537924. Epub 2019 Apr 1.

Cited by

Graph-constrained Analysis for Multivariate Functional Data.
J Multivar Anal. 2025 May;207. doi: 10.1016/j.jmva.2025.105428. Epub 2025 Feb 24.
Indexing and partitioning the spatial linear model for large data sets.
PLoS One. 2023 Nov 1;18(11):e0291906. doi: 10.1371/journal.pone.0291906. eCollection 2023.
