Seo Yun Am, Kim Kyu Rang, Cho Changbum, Oh Jae Won, Kim Tae Hee
AI Weather Forecast Research Team, National Institute of Meteorological Science, Seogwipo, Korea.
Applied Meteorology Research Division, National Institute of Meteorological Science, Seogwipo, Korea.
Allergy Asthma Immunol Res. 2020 Jan;12(1):149-163. doi: 10.4168/aair.2020.12.1.149.
Oak is the dominant tree species in Korea, and oak pollen has the highest sensitization rate among all allergenic tree species in the country. A deep neural network (DNN)-based estimation model was developed to estimate the concentration of oak pollen and overcome the shortcomings of conventional regression models.
The DNN model proposed in this study takes weather factors as input and provides pollen concentrations as output. Weather and pollen concentration data from 2007 to 2016 were obtained from the Korea Meteorological Administration pollen observation network. Because it is difficult to prevent over-fitting and underestimation with a DNN model alone, we developed a bootstrap aggregating-type ensemble model. Each of the 30 ensemble members was trained on data randomly sampled at a fixed rate according to the pollen risk grade. To verify the effectiveness of the proposed model, we compared its performance with that of regression and support vector regression (SVR) models under the same conditions, with respect to the prediction of pollen concentrations, risk levels, and season length.
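The ensemble idea described here (bootstrap resampling with a fixed share per risk grade, then averaging the members) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the grade cut-offs in `risk_grade`, the `share` values, the synthetic data, and above all the base learner, which is a one-variable least-squares line standing in for the DNN so that the example runs with the standard library only. It is not the authors' implementation.

```python
import random
from collections import defaultdict
from statistics import mean


def risk_grade(concentration):
    """Map a concentration to a grade. Cut-offs are hypothetical, not from the paper."""
    if concentration < 50:
        return "low"
    if concentration < 100:
        return "moderate"
    return "high"


def stratified_bootstrap(data, share, size, rng):
    """Resample (x, y) pairs with replacement, drawing a fixed share from each grade.

    Assumes every grade named in `share` occurs at least once in `data`.
    """
    by_grade = defaultdict(list)
    for x, y in data:
        by_grade[risk_grade(y)].append((x, y))
    sample = []
    for grade, fraction in share.items():
        sample.extend(rng.choices(by_grade[grade], k=round(size * fraction)))
    return sample


def fit_line(sample):
    """Stand-in base learner: a least-squares line (the paper uses a DNN here)."""
    xs = [x for x, _ in sample]
    ys = [y for _, y in sample]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in sample)
    slope = sxy / sxx if sxx else 0.0
    return lambda x: y_bar + slope * (x - x_bar)


def train_ensemble(data, share, size, n_members=30, seed=0):
    """Train each member on its own stratified bootstrap sample."""
    rng = random.Random(seed)
    return [fit_line(stratified_bootstrap(data, share, size, rng))
            for _ in range(n_members)]


def predict(members, x):
    """Aggregate by averaging the member predictions."""
    return mean([member(x) for member in members])


# Synthetic data: one weather feature x and a concentration y = 10 * x.
data = [(float(t), 10.0 * t) for t in range(1, 16)]
share = {"low": 0.4, "moderate": 0.3, "high": 0.3}  # hypothetical sampling shares
members = train_ensemble(data, share, size=20)
print(len(members), predict(members, 8.0))
```

Fixing the share per grade means that rare high-concentration days are not swamped by the many low-concentration days in each resample, which is one plausible way such sampling could counter underestimation.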
The mean absolute percentage error in the estimated pollen concentrations was 11.18%, 10.37%, and 5.04% for the regression, SVR, and DNN models, respectively. The start of the pollen season was estimated to be 20, 22, and 6 days earlier than that predicted by the regression, SVR, and DNN models, respectively. Similarly, the end of the pollen season was estimated to be 33, 20, and 9 days later than that predicted by the regression, SVR, and DNN models, respectively.
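For reference, the mean absolute percentage error is commonly defined as the average of |observed - estimated| / |observed|, expressed in percent. The sketch below shows that standard definition in Python. The function name `mape`, the choice to skip zero observations (where the ratio is undefined), and the numbers are all illustrative assumptions; the abstract does not say how the authors handled such cases.

```python
def mape(observed, estimated):
    """Mean absolute percentage error in percent, skipping zero observations."""
    pairs = [(o, e) for o, e in zip(observed, estimated) if o != 0]
    if not pairs:
        raise ValueError("no non-zero observations to score")
    return 100.0 * sum(abs(o - e) / abs(o) for o, e in pairs) / len(pairs)


# Made-up values, not data from the study.
observed = [100.0, 200.0, 50.0]
estimated = [90.0, 220.0, 50.0]
print(mape(observed, estimated))  # roughly 6.67
```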
Overall, the DNN model performed better than the other models. However, the prediction of peak pollen concentrations needs improvement. Improved observation quality, together with optimization of the DNN model, is expected to resolve this issue.