Swiss Tropical and Public Health Institute, Allschwil, Switzerland; University of Basel, Basel, Switzerland.
Federal Office of Meteorology and Climatology MeteoSwiss, Switzerland.
Sci Total Environ. 2024 Jan 1;906:167286. doi: 10.1016/j.scitotenv.2023.167286. Epub 2023 Sep 22.
High concentrations of airborne pollen trigger seasonal allergies and possibly more severe adverse respiratory and cardiovascular health events. Accurate prediction of pollen concentrations is valuable for epidemiological studies of the effects of pollen exposure. We aimed to develop a spatiotemporal machine learning model predicting daily pollen concentrations at a spatial resolution of 1 × 1 km across Switzerland between 2000 and 2019. Daily pollen concentrations for five common, highly allergenic pollen types (hazel, alder, birch, ash, and grasses) were available from fourteen measurement sites across Switzerland. We considered several predictors such as elevation, species distribution, wind speed, wind direction, temperature, precipitation, relative humidity, satellite-observed Normalized Difference Vegetation Index, and land use (CORINE, Landsat satellite) to explain variation in pollen concentration. We employed feature engineering techniques to encode categorical variables and fill in missing values. We applied a random forest machine learning model with 5-fold cross-validation. The 5th-99th percentiles for concentrations of hazel, alder, birch, ash, and grass pollen at the pollen monitoring stations were 0-298, 0-306, 0-1153, 0-800, and 0-290 pollen grains/m³, respectively. The predictive models for these concentrations yielded overall R² values of 0.87, 0.84, 0.89, 0.88, and 0.91, and temporal root mean squared errors (RMSEs) of 16.07, 16.72, 69.04, 41.50, and 22.45 pollen grains/m³, respectively. An analysis of predictor variable importance indicates that the average national daily pollen concentration is the most important predictor of pollen concentrations for all pollen types. Furthermore, meteorological variables including temperature, total precipitation, humidity, boundary layer height, wind speed, and wind direction, as well as date and satellite features, are important factors in pollen concentration prediction.
These spatiotemporal pollen models will serve to estimate individual residential pollen exposure for epidemiological studies. Resulting estimates will enable us to study respiratory and cardiovascular mortality and hospital admissions in Switzerland.
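The modelling steps named in the abstract (encoding categorical variables, filling in missing values, fitting a random forest, and evaluating with 5-fold cross-validation) can be illustrated with a minimal sketch in Python using scikit-learn. This is not the authors' code. The data are synthetic, the column names (`national_mean`, `temperature`, `precipitation`, `land_use`) are hypothetical stand-ins for the predictors listed above, and median imputation, one-hot encoding, and randomly shuffled folds are assumptions, since the abstract does not state which techniques or fold design were used. The RMSE computed here is an overall out-of-fold RMSE, not the temporal RMSE reported in the abstract.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
n = 600

# Synthetic stand-in data; none of these values come from the study.
df = pd.DataFrame({
    "national_mean": rng.gamma(2.0, 20.0, n),    # hypothetical national daily mean
    "temperature": rng.normal(12.0, 6.0, n),     # hypothetical daily temperature
    "precipitation": rng.exponential(2.0, n),    # hypothetical daily precipitation
    "land_use": rng.choice(["forest", "urban", "agriculture"], n),
})

# Synthetic target: a noisy function of the predictors, clipped at zero.
signal = df["national_mean"] * (1.0 + 0.02 * df["temperature"])
y = np.clip(signal + rng.normal(0.0, 5.0, n), 0.0, None).to_numpy()

# Remove some temperature values so the imputation step has work to do.
missing_rows = rng.choice(n, size=30, replace=False)
df.loc[missing_rows, "temperature"] = np.nan

numeric = ["national_mean", "temperature", "precipitation"]
categorical = ["land_use"]

# Assumed preprocessing: median imputation and one-hot encoding.
preprocess = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

model = Pipeline([
    ("prep", preprocess),
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
])

# 5-fold cross-validation with shuffled folds (an assumed fold design).
cv = KFold(n_splits=5, shuffle=True, random_state=0)
pred = cross_val_predict(model, df, y, cv=cv)

# Score the out-of-fold predictions.
r2 = r2_score(y, pred)
rmse = float(np.sqrt(mean_squared_error(y, pred)))
print(f"out-of-fold R2 = {r2:.2f}, RMSE = {rmse:.2f}")
```

Placing the imputer and encoder inside the pipeline means they are refit on the training portion of each fold, so no information from the held-out fold leaks into preprocessing. For data collected at a small number of stations, folds grouped by station or by time period may give a more conservative estimate than random folds.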