Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany; Institute for Medical Information Processing, Biometry and Epidemiology (IBE), Faculty of Medicine, LMU Munich, Pettenkofer School of Public Health, Munich, Germany.
Climate Service Center Germany (GERICS), Helmholtz-Zentrum Hereon, Hamburg, Germany.
Environ Res. 2023 Dec 1;238(Pt 2):117173. doi: 10.1016/j.envres.2023.117173. Epub 2023 Sep 20.
The lack of readily available methods for estimating high-resolution near-surface relative humidity (RH), together with the inability of weather stations to fully capture its spatiotemporal variability, can lead to exposure misclassification in environmental epidemiology studies. We therefore aimed to predict daily mean RH across Germany at a 1 × 1 km resolution for 2000-2021. RH observations, longitude and latitude, modelled air temperature, precipitation and wind speed, as well as remote sensing information on topographic elevation, vegetation, and the true color band composite were incorporated in a Random Forest (RF) model. Date was included as an additional predictor to capture temporal variation in the relationship between the response and the explanatory variables. The model achieved high accuracy (R = 0.83) and low errors (Root Mean Square Error (RMSE) of 5.07%, Mean Absolute Percentage Error (MAPE) of 5.19% and Mean Percentage Error (MPE) of -0.53%), calculated via ten-fold cross-validation. A comparison of our RH predictions with measurements from a dense monitoring network in the city of Augsburg, southern Germany, confirmed the good performance (R ≥ 0.86, RMSE ≤ 5.45%, MAPE ≤ 5.59%, MPE ≤ 3.11%). The predictions showed high RH across Germany (22-year average of 79.00%) and high spatial variability across the country, exceeding 12% in yearly averages. Our findings indicate that the proposed RF model is suitable for estimating RH for a whole country at high resolution and that it provides a reliable RH dataset for epidemiological analyses and other environmental research purposes.
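The error measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so the definitions below are the common textbook ones and are an assumption, in particular the sign convention for MPE (predicted minus observed, relative to observed). The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(obs, pred):
    """Root Mean Square Error, in the same units as the inputs."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mape(obs, pred):
    """Mean Absolute Percentage Error, in percent of the observed value."""
    return 100.0 * sum(abs((p - o) / o) for o, p in zip(obs, pred)) / len(obs)


def mpe(obs, pred):
    """Mean Percentage Error, in percent.

    Assumed sign convention: positive means over-prediction on average.
    """
    return 100.0 * sum((p - o) / o for o, p in zip(obs, pred)) / len(obs)


def pearson_r(obs, pred):
    """Pearson correlation coefficient between observations and predictions."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical daily mean RH values in percent (illustration only).
observed = [80.0, 50.0, 100.0]
predicted = [84.0, 45.0, 100.0]

print(rmse(observed, predicted))
print(mape(observed, predicted))
print(mpe(observed, predicted))
print(pearson_r(observed, predicted))
```

Because MPE keeps the sign of each error, over- and under-predictions cancel, so a value near zero (such as the reported -0.53%) suggests little systematic bias, whereas MAPE and RMSE describe the typical size of the error.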