Sampson Paul D, Richards Mark, Szpiro Adam A, Bergen Silas, Sheppard Lianne, Larson Timothy V, Kaufman Joel D
Department of Statistics, University of Washington, Box 354322, Seattle, WA 98195-4322, USA.
Atmos Environ (1994). 2013 Aug 1;75:383-392. doi: 10.1016/j.atmosenv.2013.04.015.
Many cohort studies in environmental epidemiology require accurate modeling and prediction of fine scale spatial variation in ambient air quality across the U.S. This modeling requires the use of small spatial scale geographic or "land use" regression covariates and some degree of spatial smoothing. Furthermore, the details of the prediction of air quality by land use regression and the spatial variation in ambient air quality not explained by this regression should be allowed to vary across the continent due to the large scale heterogeneity in topography, climate, and sources of air pollution. This paper introduces a regionalized national universal kriging model for annual average fine particulate matter (PM2.5) monitoring data across the U.S. To take full advantage of an extensive database of land use covariates we chose to use the method of Partial Least Squares, rather than variable selection, for the regression component of the model (the "universal" in "universal kriging") with regression coefficients and residual variogram models allowed to vary across three regions defined as West Coast, Mountain West, and East. We demonstrate a very high level of cross-validated accuracy of prediction with an overall R² of 0.88 and well-calibrated predictive intervals. In accord with the spatially varying characteristics of PM2.5 on a national scale and differing kriging smoothness parameters, the accuracy of the prediction varies by region with predictive intervals being notably wider in the West Coast and Mountain West in contrast to the East.
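The model structure described above (Partial Least Squares scores as the regression component, a kriged residual, and parameters that differ by region) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the exponential covariance, the fixed per-region covariance parameters (nugget, partial sill, range), and the choice to compute PLS scores separately within each region are all assumptions made for illustration. The abstract does not state how these quantities were estimated.

```python
import numpy as np

def pls_rotation(X, y, n_components):
    """PLS1 via NIPALS. Returns (column means, rotation R) such that
    scores = (X - means) @ R. Hypothetical helper for illustration."""
    means = X.mean(axis=0)
    Xk = X - means
    yc = y - y.mean()
    W, P = [], []
    for _ in range(n_components):
        w = Xk.T @ yc                 # weight: covariance of covariates with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # score vector
        p = Xk.T @ t / (t @ t)        # loading vector
        Xk = Xk - np.outer(t, p)      # deflate the covariates
        W.append(w)
        P.append(p)
    W, P = np.column_stack(W), np.column_stack(P)
    return means, W @ np.linalg.inv(P.T @ W)

def uk_predict(coords, Z, y, coords0, Z0, nugget, psill, corr_range):
    """Universal kriging mean: GLS trend on Z plus kriged residual,
    under an assumed exponential covariance with known parameters."""
    n = len(y)
    F = np.column_stack([np.ones(n), Z])          # trend design at monitors
    F0 = np.column_stack([np.ones(len(Z0)), Z0])  # trend design at targets
    D = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    D0 = np.linalg.norm(coords0[:, None, :] - coords[None, :, :], axis=2)
    C = psill * np.exp(-D / corr_range) + nugget * np.eye(n)
    c0 = psill * np.exp(-D0 / corr_range)
    Ci_F = np.linalg.solve(C, F)
    beta = np.linalg.solve(F.T @ Ci_F, Ci_F.T @ y)  # GLS coefficients
    resid = y - F @ beta
    return F0 @ beta + c0 @ np.linalg.solve(C, resid)

def regional_uk(coords, X, y, region, coords0, X0, region0, params,
                n_components=2):
    """Fit and predict independently within each region.
    params maps region label -> (nugget, psill, corr_range), assumed known."""
    pred = np.full(len(X0), np.nan)
    for r, (nugget, psill, corr_range) in params.items():
        m, m0 = region == r, region0 == r
        means, R = pls_rotation(X[m], y[m], n_components)
        pred[m0] = uk_predict(coords[m], (X[m] - means) @ R, y[m],
                              coords0[m0], (X0[m0] - means) @ R,
                              nugget, psill, corr_range)
    return pred
```

In this sketch each region gets its own regression coefficients and its own covariance parameters, which mirrors the regionalization the abstract describes. With a zero nugget the predictor interpolates the monitoring values exactly, so a nonzero nugget would be needed to smooth rather than interpolate.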