Zhou Hongye, Zhang Feng, Du Zhenhong, Liu Renyi
School of Earth Sciences, Zhejiang University, Hangzhou, 310027, China.
School of Earth Sciences, Zhejiang University, Hangzhou, 310027, China; Zhejiang Provincial Key Laboratory of Geographic Information Science, Hangzhou, 310028, China.
Environ Pollut. 2021 Jan 19;273:116473. doi: 10.1016/j.envpol.2021.116473.
Air pollution is a complex process affected by meteorological conditions and other chemical components. Numerous studies have demonstrated that data-driven spatio-temporal prediction models of PM concentration perform comparably to model-driven models. However, data-driven models usually depend on the statistical correlation between PM and other factors and struggle to handle causality in complex systems. In this paper, we argue that domain knowledge should be incorporated into data-driven models to enhance prediction accuracy and make the models more physically realistic. We focus on the influence of the dynamic wind-field on the PM concentration distribution and, based on a wind-field surface, fuse the pollution diffusion distance with a deep learning model. To model the spatial dependence between monitoring stations, which is dynamic and anisotropic because of the wind-field, we propose a hybrid deep learning framework, the dynamic directed spatio-temporal graph convolution network (DD-STGCN). It extends the ability to handle space-time prediction in a continuous and dynamic wind-field. We use a directed graph time-series to describe the vertex states and the topological relationships between vertices, and we replace the traditional Euclidean distance with the wind-field diffusion distance to describe the proximity between vertices. Our experimental results demonstrate that the DD-STGCN model achieves better prediction ability than the LSTM, GC-LSTM, and STGCN models. Compared with the best comparison model, MAPE, MAE, and RMSE improved by 10.2%, 9.7%, and 9.6%, respectively, on average within 12 h. The performance of our model was further tested during a haze period. When both models considered the effect of wind, our model outperformed the pure data-driven model in predicting the distribution and showed the benefit of the spatial interpretability provided by domain knowledge.
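To make the idea of a directed, wind-dependent proximity between stations concrete, the following is a minimal sketch and not the DD-STGCN implementation. The abstract does not define the wind-field diffusion distance, so everything here is an assumption made for illustration: a single uniform wind vector for the whole region, the use of downwind travel time as the "distance", the Gaussian decay, and the names wind_directed_adjacency, forecast_errors, and tau. The second function computes the three error measures named in the abstract (MAPE, MAE, RMSE) using their standard definitions.

```python
import numpy as np


def wind_directed_adjacency(coords, wind, tau=6.0):
    """Directed weights adj[i, j] for transport from station i to station j.

    coords: (n, 2) station positions in km.
    wind:   (2,) wind vector in km/h, assumed uniform over the region.
    tau:    hypothetical decay time scale in hours (illustrative choice).
    """
    coords = np.asarray(coords, dtype=float)
    wind = np.asarray(wind, dtype=float)
    n = len(coords)

    # diff[i, j] is the displacement from station i to station j.
    diff = coords[None, :, :] - coords[:, None, :]
    dist = np.linalg.norm(diff, axis=-1)

    # Wind speed component along each edge i -> j (zero on the diagonal).
    along = np.zeros((n, n))
    separated = dist > 0
    along[separated] = (diff[separated] @ wind) / dist[separated]

    # Only downwind edges get a weight, so the graph is directed.
    adj = np.zeros((n, n))
    downwind = along > 0
    travel_time = dist[downwind] / along[downwind]  # hours
    adj[downwind] = np.exp(-((travel_time / tau) ** 2))
    return adj


def forecast_errors(y_true, y_pred):
    """Return (MAPE in percent, MAE, RMSE); y_true must be nonzero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mape = 100.0 * np.mean(np.abs(err / y_true))
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    return mape, mae, rmse


# Three stations on a west-east line, 10 km apart, with a westerly wind.
stations = [[0.0, 0.0], [10.0, 0.0], [20.0, 0.0]]
adjacency = wind_directed_adjacency(stations, wind=[10.0, 0.0])
print(adjacency)
```

With a Euclidean distance the matrix would be symmetric. Here the weight from the western station to the eastern ones is positive while the reverse weights are zero, and recomputing the matrix at each time step with the current wind gives a directed graph time-series in the spirit of the one described above.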