Universidad Politécnica de Madrid (UPM). ETSII-UPM, José Gutiérrez Abascal 2, 28006, Madrid, Spain.
University of Castilla-La Mancha. Institute of Environmental Sciences (Botany), Avda. Carlos III s/n, E-45071, Toledo, Spain.
Int J Biometeorol. 2021 Apr;65(4):541-554. doi: 10.1007/s00484-020-02047-z. Epub 2020 Nov 13.
According to the World Health Organization, air pollution in large cities causes numerous diseases and millions of deaths annually. Pollen exposure is related to allergic diseases, which makes pollen prediction a valuable tool for assessing the risk level posed by aeroallergens. However, airborne pollen concentrations are difficult to predict because of the inherent complexity of the relationships among biotic and environmental variables. In this work, a stochastic approach based on supervised machine learning algorithms was used to forecast daily Olea pollen concentrations in the Community of Madrid, central Spain, from 1993 to 2018. First, individual Light Gradient Boosting Machine (LightGBM) and artificial neural network (ANN) models were applied to predict the day of the year (DOY) on which the peak of the pollen season occurs. The estimated average peak dates were 149.1 ± 9.3 DOY for LightGBM and 150.1 ± 10.8 DOY for ANN, close to the observed value (148.8 ± 9.8). Second, the daily pollen concentrations during the entire pollen season were calculated using an ensemble of a two-step generalized additive model (GAM) followed by LightGBM and ANN. The daily predictions showed a coefficient of determination (r) above 0.75 under cross-validation. The predictors included in the ensemble models were meteorological variables, phenological metrics, site-specific characteristics, and preceding pollen concentrations. These state-of-the-art machine learning models show potential for use and deployment in understanding and predicting pollen risk levels during the main olive pollen season.
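The quantities reported in the abstract (peak day of the year, its mean ± spread across seasons, and the coefficient of determination) can be illustrated with a minimal Python sketch. It is not the authors' code. The function names and the toy values are hypothetical, the ± figure is assumed to be a sample standard deviation, and the coefficient of determination is assumed to follow the usual definition 1 - SS_res / SS_tot.

```python
from datetime import date, timedelta
from statistics import mean, stdev


def peak_doy(daily_counts):
    """Return the 1-based day of year with the highest daily concentration.

    `daily_counts` maps datetime.date -> pollen concentration (hypothetical input).
    """
    peak_day = max(daily_counts, key=daily_counts.get)
    return peak_day.timetuple().tm_yday


def doy_to_date(year, doy):
    """Convert a 1-based day of year back to a calendar date."""
    return date(year, 1, 1) + timedelta(days=doy - 1)


def summarize_peaks(peak_doys):
    """Mean and sample standard deviation of peak DOYs across seasons.

    Assumption: the abstract's "±" denotes a standard deviation.
    """
    return mean(peak_doys), stdev(peak_doys)


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only; these are not data from the study.
season = {
    date(2018, 5, 27): 120.0,
    date(2018, 5, 29): 410.0,
    date(2018, 5, 31): 95.0,
}
print(peak_doy(season))        # day of year of the toy peak
print(doy_to_date(2018, 149))  # calendar date for DOY 149 in 2018
```

In a non-leap year, DOY 149 corresponds to 29 May, so the reported average peak dates fall in late May.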