Prediction and model evaluation for space-time data.

Authors

Watson G L, Reid C E, Jerrett M, Telesca D

Affiliations

Department of Biostatistics, University of California, Los Angeles, CA, USA.

Department of Geography, University of Colorado, Boulder, CO, USA.

Publication information

J Appl Stat. 2023 Sep 3;51(10):2007-2024. doi: 10.1080/02664763.2023.2252208. eCollection 2024.

Abstract

Evaluation metrics for prediction error, model selection and model averaging on space-time data are understudied and poorly understood. The absence of independent replication makes prediction ambiguous as a concept and renders evaluation procedures developed for independent data inappropriate for most space-time prediction problems. Motivated by air pollution data collected during California wildfires in 2008, this manuscript attempts a formalization of the true prediction error associated with spatial interpolation. We investigate a variety of cross-validation (CV) procedures employing both simulations and case studies to provide insight into the nature of the estimand targeted by alternative data partition strategies. Consistent with recent best practice, we find that location-based cross-validation is appropriate for estimating spatial interpolation error as in our analysis of the California wildfire data. Interestingly, commonly held notions of bias-variance trade-off of CV fold size do not trivially apply to dependent data, and we recommend leave-one-location-out (LOLO) CV as the preferred prediction error metric for spatial interpolation.
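To make the location-based partition concrete, here is a minimal Python sketch of leave-one-location-out (LOLO) CV. It is not the authors' implementation: the function name `lolo_cv_rmse`, the ordinary least squares model, and the simulated monitor data are all assumptions chosen for illustration. The point it shows is that every observation from a held-out location leaves the training set together, so the error reflects prediction at an unmonitored site.

```python
import numpy as np

def lolo_cv_rmse(locations, X, y, fit, predict):
    """Leave-one-location-out CV (hypothetical helper, not from the paper).

    For each unique location, all of its observations are held out,
    the model is fit on the remaining locations, and squared errors
    on the held-out observations are pooled into a single RMSE.
    """
    locations = np.asarray(locations)
    sq_errors = []
    for loc in np.unique(locations):
        test = locations == loc              # every time point at this location
        model = fit(X[~test], y[~test])      # train on all other locations
        sq_errors.append((predict(model, X[test]) - y[test]) ** 2)
    return float(np.sqrt(np.mean(np.concatenate(sq_errors))))

# Assumed stand-in model: ordinary least squares on the design matrix.
def fit_ols(X_train, y_train):
    coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    return coef

def predict_ols(coef, X_new):
    return X_new @ coef

# Simulated toy data (an assumption): 5 monitors observed on 10 days each.
rng = np.random.default_rng(0)
n_loc, n_time = 5, 10
locations = np.repeat(np.arange(n_loc), n_time)
coords = rng.uniform(size=(n_loc, 2))                      # monitor coordinates
X = np.column_stack([np.ones(n_loc * n_time), coords[locations]])
beta = np.array([1.0, 2.0, -1.0])
y = X @ beta + rng.normal(scale=0.1, size=n_loc * n_time)

rmse = lolo_cv_rmse(locations, X, y, fit_ols, predict_ols)
print(f"LOLO CV RMSE: {rmse:.3f}")
```

Splitting the same rows at random instead would leave other days from the same monitor in the training set, which targets a different estimand from spatial interpolation error.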
