A Case Study Competition Among Methods for Analyzing Large Spatial Data.

Authors

Heaton Matthew J, Datta Abhirup, Finley Andrew O, Furrer Reinhard, Guinness Joseph, Guhaniyogi Rajarshi, Gerber Florian, Gramacy Robert B, Hammerling Dorit, Katzfuss Matthias, Lindgren Finn, Nychka Douglas W, Sun Furong, Zammit-Mangion Andrew

Affiliation

Brigham Young University, Provo, UT, USA.

Publication information

J Agric Biol Environ Stat. 2019;24(3):398-425. doi: 10.1007/s13253-018-00348-w. Epub 2018 Dec 14.

DOI:10.1007/s13253-018-00348-w
PMID:31496633
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6709111/
Abstract

The Gaussian process is an indispensable tool for spatial data analysts. The onset of the "big data" era, however, has led to the traditional Gaussian process being computationally infeasible for modern spatial data. As such, various alternatives to the full Gaussian process that are more amenable to handling big spatial data have been proposed. These modern methods often exploit low-rank structures and/or multi-core and multi-threaded computing environments to facilitate computation. This study provides, first, an introductory overview of several methods for analyzing large spatial data. Second, this study describes the results of a predictive competition among the described methods as implemented by different groups with strong expertise in the methodology. Specifically, each research group was provided with two training datasets (one simulated and one observed) along with a set of prediction locations. Each group then wrote its own implementation of its method to produce predictions at the given locations, and each implementation was subsequently run on a common computing environment. The methods were then compared in terms of various predictive diagnostics. Supplementary materials regarding implementation details of the methods and code are available for this article online.

ELECTRONIC SUPPLEMENTARY MATERIAL

Supplementary materials for this article are available at 10.1007/s13253-018-00348-w.
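
To make the computational bottleneck mentioned in the abstract concrete, the following is a minimal sketch, not taken from the paper, of full Gaussian process prediction on a small simulated dataset, followed by a few predictive diagnostics. The helper names (`exp_cov`, `gp_predict`), the exponential covariance, its parameter values and the simulated data are all illustrative assumptions. The diagnostics shown (RMSE, MAE and 95% interval coverage) are common choices and are not necessarily the ones used in the competition.

```python
import numpy as np

# Illustrative assumptions: exponential covariance with fixed, known parameters.
VARIANCE, RANGE, NUGGET = 1.0, 0.3, 0.05

def exp_cov(a, b):
    """Exponential covariance between two sets of 2-D locations."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return VARIANCE * np.exp(-d / RANGE)

def gp_predict(train_xy, train_y, pred_xy):
    """Zero-mean full GP prediction (simple kriging) of noisy observations."""
    n = train_xy.shape[0]
    K = exp_cov(train_xy, train_xy) + NUGGET * np.eye(n)
    # The Cholesky factorisation costs O(n^3) time and O(n^2) memory,
    # which is what makes the full GP impractical for very large n.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, train_y))
    K_star = exp_cov(pred_xy, train_xy)
    mean = K_star @ alpha
    v = np.linalg.solve(L, K_star.T)
    var = VARIANCE + NUGGET - np.sum(v**2, axis=0)
    return mean, var

# Simulate one realisation at 350 random locations in the unit square.
rng = np.random.default_rng(0)
xy = rng.uniform(size=(350, 2))
K_all = exp_cov(xy, xy) + NUGGET * np.eye(350)
y = np.linalg.cholesky(K_all) @ rng.standard_normal(350)

# Hold out the last 50 locations as prediction locations.
train_xy, train_y = xy[:300], y[:300]
pred_xy, test_y = xy[300:], y[300:]

mean, var = gp_predict(train_xy, train_y, pred_xy)

# Predictive diagnostics on the held-out values.
rmse = np.sqrt(np.mean((mean - test_y) ** 2))
mae = np.mean(np.abs(mean - test_y))
half_width = 1.96 * np.sqrt(var)
coverage = np.mean(np.abs(mean - test_y) <= half_width)

print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  95% coverage={coverage:.2f}")
```

With only 300 training locations the factorisation is instantaneous, but its cost grows cubically with the number of observations, which is the motivation for the low-rank and parallel alternatives the abstract refers to.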

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7539/6709111/0abe754a2ec3/13253_2018_348_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7539/6709111/d316e69d4c3b/13253_2018_348_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7539/6709111/f61909b03678/13253_2018_348_Fig3_HTML.jpg