Department of Bioenvironmental Systems Engineering, National Taiwan University, Taipei 10617, Taiwan.
Department of Water Resources and Environmental Engineering, Tamkang University, New Taipei City 25137, Taiwan.
Sci Total Environ. 2020 Sep 20;736:139656. doi: 10.1016/j.scitotenv.2020.139656. Epub 2020 May 23.
The complex mixture of local emission sources and regional transport of air pollutants makes accurate PM2.5 prediction a very challenging yet crucial task, especially under high pollution conditions. A symbolic representation of spatio-temporal PM2.5 features is key to effective air pollution regulatory plans that notify the public to take necessary precautions against air pollution. The self-organizing map (SOM) can cluster high-dimensional datasets to form a meaningful topological map. This study implements the SOM to extract and clearly distinguish the spatio-temporal features of long-term regional PM2.5 concentrations in a visible two-dimensional topological map. The spatial distribution of the configured topological map, obtained with the Kriging method, spans the long-term datasets of 25 monitoring stations in northern Taiwan, and the temporal behavior of PM2.5 concentrations at various time scales (i.e., yearly, seasonal, and hourly) is explored in detail. Finally, we establish a machine learning model to predict PM2.5 concentrations for high pollution events. The analytical results indicate that: (1) high population density and heavy traffic load correspond to high PM2.5 concentrations; (2) the change of seasons has an obvious effect on PM2.5 concentration variation; and (3) the key input variables of the prediction model identified by the Gamma Test can improve the model's reliability and accuracy for multi-step-ahead PM2.5 prediction. The results demonstrate that machine learning techniques can skillfully summarize and visibly present the clustered spatio-temporal PM2.5 features as well as improve air quality prediction accuracy.
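To illustrate how a SOM clusters high-dimensional data onto a two-dimensional grid, the following is a minimal NumPy sketch. It is not the study's implementation: the 3x3 grid, the linear decay of learning rate and neighbourhood width, the function names `train_som` and `best_matching_unit`, and the synthetic 25-value vectors (standing in for one reading per monitoring station) are all assumptions made for illustration.

```python
import numpy as np


def best_matching_unit(weights, x):
    """Return the (row, col) of the node whose weight vector is closest to x."""
    dist2 = np.sum((weights - x) ** 2, axis=-1)
    return np.unravel_index(np.argmin(dist2), dist2.shape)


def train_som(data, rows=3, cols=3, n_iter=500, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small SOM; returns weights of shape (rows, cols, n_features)."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = data.shape
    # Initialise node weights from randomly chosen samples (assumed scheme).
    picks = rng.integers(0, n_samples, size=rows * cols)
    weights = data[picks].reshape(rows, cols, n_features).astype(float)
    # Grid coordinates of every node, used for neighbourhood distances.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 1e-3     # neighbourhood width shrinks
        x = data[rng.integers(0, n_samples)]     # one random training sample
        bmu = best_matching_unit(weights, x)
        grid_dist2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
        # Pull each node towards the sample, weighted by its grid distance to the BMU.
        weights += lr * h[..., None] * (x - weights)
    return weights


# Synthetic stand-in data (hypothetical): 25 values per sample, two pollution regimes.
rng = np.random.default_rng(42)
low = rng.normal(10.0, 1.0, size=(200, 25))
high = rng.normal(50.0, 1.0, size=(200, 25))
data = np.vstack([low, high])

weights = train_som(data)
print(best_matching_unit(weights, low.mean(axis=0)))
print(best_matching_unit(weights, high.mean(axis=0)))
```

Each trained node holds a prototype vector, so mapping a new observation to its best matching unit assigns it to a cluster on the grid. With well-separated regimes such as the synthetic ones above, the two regime means would typically be expected to land on different nodes.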