Chen Shuixia, Xu Zeshui, Wang Xinxin, Zhang Chenxi
Business School, Sichuan University, Chengdu 610064, China.
Knowl Based Syst. 2022 Dec 22;258:109996. doi: 10.1016/j.knosys.2022.109996. Epub 2022 Oct 17.
Research on the correlation between COVID-19 and air pollution has attracted increasing attention since the start of the COVID-19 pandemic. While many related issues have been widely studied, research on ambient air pollutant concentration prediction (APCP) during COVID-19 is still in its infancy. Most existing studies on APCP are based on machine learning methods, which are not well suited to APCP during COVID-19 because historical observations follow different distributions before and after the onset of the pandemic. Therefore, to carry out prediction from historical observations with differing distributions, this paper proposes an improved transfer learning model combined with machine learning for APCP during COVID-19. Specifically, the paper employs the Gaussian mixture method and an optimization algorithm to construct a new source domain that is similar to the target domain for subsequent transfer learning. Several commonly used machine learning models are then trained in the new source domain, and the trained models are transferred to the target domain to obtain APCP results. Experimental results on a real-world dataset suggest that the improved, transfer-learning-based machine learning methods achieve high prediction accuracy. In terms of managerial insights, the effects of influential factors are analyzed through their relationship with the prediction results, and their importance is ranked using average marginal contributions and partial dependence plots.
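The source-domain construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the optimization algorithm, so this example substitutes a simple ranking of source rows by their log-likelihood under a Gaussian mixture fitted on the target features. The function names (`select_target_like_source`, `make_toy_domains`, `transfer_r2`), the `keep_fraction` parameter, the choice of `Ridge` as the learner, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.mixture import GaussianMixture


def select_target_like_source(X_src, X_tgt, n_components=2, keep_fraction=0.5, seed=0):
    """Return sorted indices of the source rows most likely under a target-fitted GMM.

    Assumption: likelihood ranking stands in for the paper's optimization
    step, which the abstract does not describe.
    """
    gmm = GaussianMixture(n_components=n_components, random_state=seed).fit(X_tgt)
    log_lik = gmm.score_samples(X_src)  # per-row log-likelihood under the target GMM
    n_keep = max(1, int(keep_fraction * len(X_src)))
    return np.sort(np.argsort(log_lik)[-n_keep:])


def make_toy_domains(seed=0):
    """Synthetic data only: source rows 0-199 are far from the target, rows 200-399 are near it."""
    rng = np.random.default_rng(seed)
    w_far, w_near = np.array([-1.0, 3.0]), np.array([2.0, -1.0])
    X_far = rng.normal(8.0, 1.0, size=(200, 2))
    X_near = rng.normal(0.0, 1.0, size=(200, 2))
    X_src = np.vstack([X_far, X_near])
    y_src = np.concatenate([X_far @ w_far, X_near @ w_near]) + rng.normal(0.0, 0.1, 400)
    X_tgt = rng.normal(0.0, 1.0, size=(100, 2))
    y_tgt = X_tgt @ w_near + rng.normal(0.0, 0.1, 100)
    return X_src, y_src, X_tgt, y_tgt


def transfer_r2(seed=0):
    """Train on the selected source rows, then score (R^2) on the target domain."""
    X_src, y_src, X_tgt, y_tgt = make_toy_domains(seed)
    keep = select_target_like_source(X_src, X_tgt, seed=seed)
    model = Ridge(alpha=1.0).fit(X_src[keep], y_src[keep])
    return keep, model.score(X_tgt, y_tgt)


if __name__ == "__main__":
    keep, r2 = transfer_r2()
    print(f"kept {len(keep)} source rows, target R^2 = {r2:.3f}")
```

In this toy setup only the target features, not the target labels, are used to pick source rows, which mirrors the idea of building a source domain that resembles the target before any model is trained. A real application would replace the synthetic arrays with pollutant and covariate observations and would likely need a more careful selection or reweighting step.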