An ensemble-based model of PM2.5 concentration across the contiguous United States with high spatiotemporal resolution.

Author information

Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA, United States; Research Center for Public Health, Tsinghua University, Beijing, China.

Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA, United States.

Publication information

Environ Int. 2019 Sep;130:104909. doi: 10.1016/j.envint.2019.104909. Epub 2019 Jul 1.

Abstract

Various approaches have been proposed to model PM2.5 in the recent decade, with satellite-derived aerosol optical depth, land-use variables, chemical transport model predictions, and several meteorological variables as major predictor variables. Our study used an ensemble model that integrated multiple machine learning algorithms and predictor variables to estimate daily PM2.5 at a resolution of 1 km × 1 km across the contiguous United States. We used a generalized additive model that accounted for geographic differences to combine PM2.5 estimates from a neural network, a random forest, and gradient boosting. The three machine learning algorithms were based on multiple predictor variables, including satellite data, meteorological variables, land-use variables, elevation, chemical transport model predictions, several reanalysis datasets, and others. The model training results from 2000 to 2015 indicated good model performance, with a 10-fold cross-validated R² of 0.86 for daily PM2.5 predictions. For annual PM2.5 estimates, the cross-validated R² was 0.89. Our model demonstrated good performance up to 60 μg/m³. Using the trained PM2.5 model and predictor variables, we predicted daily PM2.5 from 2000 to 2015 at every 1 km × 1 km grid cell in the contiguous United States. We also used localized land-use variables within 1 km × 1 km grids to downscale PM2.5 predictions to 100 m × 100 m grid cells. To characterize uncertainty, we used meteorological variables, land-use variables, and elevation to model the monthly standard deviation of the difference between daily monitored and predicted PM2.5 for every 1 km × 1 km grid cell. This PM2.5 prediction dataset, including the downscaled and uncertainty predictions, allows epidemiologists to accurately estimate the adverse health effects of PM2.5. Compared with the performance of individual base learners, an ensemble model achieves a better overall estimation. It is worth exploring other ensemble model formats to synthesize estimations from different models or from different groups to improve overall performance.
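The ensemble idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the paper combines base learners with a generalized additive model, whereas the sketch below substitutes a much simpler technique, per-region inverse-MSE weighting, to show how location-dependent blending and a 10-fold cross-validated R² fit together. The region labels, noise levels, function names, and synthetic data are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_region_weights(rows, n_learners):
    """Per-region inverse-MSE weights (a stand-in for the paper's GAM).

    rows: list of (region, observed, [pred_1, ..., pred_k]).
    """
    sse = defaultdict(lambda: [0.0] * n_learners)
    for region, obs, preds in rows:
        for j, p in enumerate(preds):
            sse[region][j] += (obs - p) ** 2
    weights = {}
    for region, errs in sse.items():
        inv = [1.0 / (e + 1e-9) for e in errs]  # small term avoids division by zero
        total = sum(inv)
        weights[region] = [v / total for v in inv]
    return weights


def ensemble_predict(weights, region, preds):
    """Blend base-learner predictions; unseen regions get equal weights."""
    w = weights.get(region) or [1.0 / len(preds)] * len(preds)
    return sum(wi * pi for wi, pi in zip(w, preds))


def cross_validated_r2(rows, n_learners, k=10, seed=0):
    """k-fold CV: fit weights on k-1 folds, score on the held-out fold."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    observed, predicted = [], []
    for held_out in folds:
        held = set(held_out)
        train = [rows[i] for i in idx if i not in held]
        weights = fit_region_weights(train, n_learners)
        for i in held_out:
            region, obs, preds = rows[i]
            observed.append(obs)
            predicted.append(ensemble_predict(weights, region, preds))
    return r_squared(observed, predicted)


def make_synthetic_rows(n=2000, seed=1):
    """Hypothetical data: three learners whose accuracy differs by region."""
    rng = random.Random(seed)
    noise_sd = {"east": [2.0, 4.0, 6.0], "west": [6.0, 4.0, 2.0]}
    rows = []
    for _ in range(n):
        region = rng.choice(["east", "west"])
        truth = rng.uniform(2.0, 40.0)  # stand-in for a monitored value
        preds = [truth + rng.gauss(0.0, sd) for sd in noise_sd[region]]
        rows.append((region, truth, preds))
    return rows


rows = make_synthetic_rows()
obs = [r[1] for r in rows]
for j in range(3):
    print("base learner", j, "R2:", round(r_squared(obs, [r[2][j] for r in rows]), 3))
print("ensemble 10-fold CV R2:", round(cross_validated_r2(rows, 3), 3))
```

Because each synthetic learner is strongest in a different region, region-specific weights let the blend lean on whichever learner is most reliable locally, which is the intuition behind accounting for geographic differences when combining base learners.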
