NONSEPARABLE DYNAMIC NEAREST NEIGHBOR GAUSSIAN PROCESS MODELS FOR LARGE SPATIO-TEMPORAL DATA WITH AN APPLICATION TO PARTICULATE MATTER ANALYSIS.

Authors

Datta Abhirup, Banerjee Sudipto, Finley Andrew O, Hamm Nicholas A S, Schaap Martijn

Affiliations

Johns Hopkins University.

University of California, Los Angeles.

Publication information

Ann Appl Stat. 2016 Sep;10(3):1286-1316. doi: 10.1214/16-AOAS931. Epub 2016 Sep 28.

Abstract

Particulate matter (PM) is a class of malicious environmental pollutants known to be detrimental to human health. Regulatory efforts aimed at curbing PM levels in different countries often require high resolution space-time maps that can identify red-flag regions exceeding statutory concentration limits. Continuous spatio-temporal Gaussian Process (GP) models can deliver maps depicting predicted PM levels and quantify predictive uncertainty. However, GP-based approaches are usually thwarted by computational challenges posed by large datasets. We construct a novel class of scalable Dynamic Nearest Neighbor Gaussian Process (DNNGP) models that can provide a sparse approximation to any spatio-temporal GP (e.g., with nonseparable covariance structures). The DNNGP we develop here can be used as a sparsity-inducing prior for spatio-temporal random effects in any Bayesian hierarchical model to deliver full posterior inference. Storage and memory requirements for a DNNGP model are linear in the size of the dataset, thereby delivering massive scalability without sacrificing inferential richness. Extensive numerical studies reveal that the DNNGP provides substantially superior approximations to the underlying process than low-rank approximations. Finally, we use the DNNGP to analyze a massive air quality dataset to substantially improve predictions of PM levels across Europe in conjunction with the LOTOS-EUROS chemistry transport models (CTMs).
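The sparsity and linear-storage claims in the abstract can be illustrated with a minimal nearest-neighbor sketch. It is not the authors' implementation. It assumes a Gneiting-type nonseparable covariance chosen only for illustration, a fixed ordering of the space-time points by time, and a simple scaled space-time distance for picking neighbors (the paper's own neighbor selection may differ). The names `gneiting_cov`, `nn_factors`, `nn_logdensity`, and `time_scale` are hypothetical. Each point is conditioned on at most `m` earlier points, so only about `n * (m + 1)` numbers are stored instead of a dense `n x n` covariance matrix.

```python
import numpy as np

def gneiting_cov(s1, t1, s2, t2, sigma2=1.0, a=1.0, c=1.0):
    """Illustrative nonseparable space-time covariance (Gneiting-type, d = 2).

    C(h, u) = sigma2 / (a u^2 + 1) * exp(-c ||h|| / sqrt(a u^2 + 1)).
    The parameter values are assumptions, not taken from the paper.
    """
    h = np.linalg.norm(s1[:, None, :] - s2[None, :, :], axis=-1)  # spatial lags
    u = np.abs(t1[:, None] - t2[None, :])                          # temporal lags
    psi = a * u**2 + 1.0
    return sigma2 / psi * np.exp(-c * h / np.sqrt(psi))

def nn_factors(s, t, m, time_scale=1.0):
    """For each point i, find up to m neighbors among points 0..i-1 and
    compute the kriging weights b_i and conditional variance F_i."""
    n = len(t)
    nbrs, B, F = [], [], np.empty(n)
    for i in range(n):
        # Assumed neighbor rule: nearest earlier points in scaled space-time distance.
        d = np.sqrt(((s[:i] - s[i]) ** 2).sum(axis=1)
                    + (time_scale * (t[:i] - t[i])) ** 2)
        idx = np.argsort(d)[:m]
        c_ii = gneiting_cov(s[i:i + 1], t[i:i + 1], s[i:i + 1], t[i:i + 1])[0, 0]
        if idx.size:
            C_nn = gneiting_cov(s[idx], t[idx], s[idx], t[idx])
            c_ni = gneiting_cov(s[idx], t[idx], s[i:i + 1], t[i:i + 1])[:, 0]
            b = np.linalg.solve(C_nn, c_ni)
            F[i] = c_ii - c_ni @ b
        else:
            b = np.empty(0)
            F[i] = c_ii
        nbrs.append(idx)
        B.append(b)
    return nbrs, B, F

def nn_logdensity(w, nbrs, B, F):
    """Log-density of w under the sparse product-of-conditionals approximation."""
    resid = np.array([w[i] - B[i] @ w[nbrs[i]] for i in range(len(w))])
    return -0.5 * np.sum(np.log(2.0 * np.pi * F) + resid**2 / F)

# Small synthetic demo: n space-time points, at most m neighbors each.
rng = np.random.default_rng(0)
n, m = 200, 10
s = rng.uniform(size=(n, 2))                 # spatial coordinates
t = np.sort(rng.uniform(0.0, 5.0, size=n))   # time stamps, ordered
w = rng.normal(size=n)                       # stand-in for random effects
nbrs, B, F = nn_factors(s, t, m)
stored = sum(b.size for b in B) + F.size
print("stored coefficients:", stored, "vs dense entries:", n * n)
print("approximate log-density:", nn_logdensity(w, nbrs, B, F))
```

When `m` is set to `n - 1`, every point conditions on all earlier points and the product of conditionals reproduces the full Gaussian density exactly; smaller `m` trades accuracy for sparsity.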
