Department of Environmental Health, Harvard University, TH Chan School of Public Health, Boston, Massachusetts 02115, United States.
School of Public Policy and Government, Fundação Getúlio Vargas, Brasília, Distrito Federal 72125590, Brazil.
Environ Sci Technol. 2020 Sep 15;54(18):11037-11047. doi: 10.1021/acs.est.0c01791. Epub 2020 Sep 1.
In this paper, we integrated multiple types of predictor variables and three types of machine learners (neural network, random forest, and gradient boosting) into a geographically weighted ensemble model to estimate the daily maximum 8 h O₃ concentration at high resolution in both space (1 km × 1 km grid cells covering the contiguous United States) and time (daily estimates between 2000 and 2016). We further quantified monthly model uncertainty for our 1 km × 1 km gridded domain. The results demonstrate high overall model performance, with an average cross-validated R² (coefficient of determination) against observations of 0.90 and an R² of 0.86 for annual averages. Overall, the performance of the three machine learning algorithms was quite similar, and the ensemble model outperformed any single algorithm. The East North Central region of the United States had the highest R², 0.93, and performance was weakest for the western mountainous regions (R² of 0.86) and New England (R² of 0.87). In the cross-validation by season, our model performed best during summer, with an R² of 0.88. This study can help the environmental health community more accurately estimate the health impacts of O₃ over space and time, especially in health studies at an intra-urban scale.
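As a rough illustration of the two ideas named in the abstract, the coefficient of determination and a geographically weighted blend of base-learner predictions, the following Python sketch may help. The abstract does not say how the ensemble weights are computed, so the Gaussian distance kernel, the weighted least-squares fit, and every name here (`r_squared`, `local_ensemble_weights`, `bandwidth_km`) are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def local_ensemble_weights(monitor_xy, base_preds, observed, target_xy, bandwidth_km):
    """Fit blending coefficients for one target location (hypothetical scheme).

    monitor_xy   : (n, 2) monitor coordinates in km
    base_preds   : (n, k) predictions of k base learners at the monitors
    observed     : (n,)   observed values at the monitors
    target_xy    : (2,)   location where the blend is needed, in km
    bandwidth_km : Gaussian kernel bandwidth (assumed, not from the paper)

    Returns (k + 1,) coefficients: intercept, then one weight per learner.
    """
    monitor_xy = np.asarray(monitor_xy, dtype=float)
    base_preds = np.asarray(base_preds, dtype=float)
    observed = np.asarray(observed, dtype=float)

    # Nearby monitors count more: Gaussian kernel on squared distance.
    dist_sq = np.sum((monitor_xy - np.asarray(target_xy, dtype=float)) ** 2, axis=1)
    kernel = np.exp(-dist_sq / (2.0 * bandwidth_km ** 2))

    # Weighted least squares, solved by scaling rows with sqrt(kernel).
    design = np.column_stack([np.ones(len(observed)), base_preds])
    root_w = np.sqrt(kernel)
    coef, *_ = np.linalg.lstsq(design * root_w[:, None], observed * root_w, rcond=None)
    return coef


def blend(coef, base_preds_at_target):
    """Apply fitted coefficients to the k base-learner predictions at the target."""
    return coef[0] + np.dot(coef[1:], base_preds_at_target)
```

In this sketch, calling `local_ensemble_weights` once per grid cell would let the relative trust in each learner vary across space, and `r_squared` could then be applied to held-out monitors to obtain a cross-validated score comparable in kind to the values reported above.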