School of Mathematics and Computer Science, Yan'an University, Yan'an, 716000, China.
Environ Sci Pollut Res Int. 2023 Sep;30(42):96562-96574. doi: 10.1007/s11356-023-29336-5. Epub 2023 Aug 14.
Air pollution is an increasingly serious problem. Accurate and efficient prediction of air quality can help prevent air pollution and improve quality of life. The air quality index (AQI) is a dimensionless measure that describes air quality quantitatively. In this study, machine learning (ML) methods were used to estimate the AQI of Shijiazhuang, China, with pollutant concentrations and meteorological factors as input data. Specifically, eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Random Forest (RF) models were used. The experimental results show that the XGBoost model captures the AQI variation trend well: its R² is 0.929, which is 0.3% and 2.3% higher than the R² of the RF and LightGBM models, respectively. In addition, a SHAP-based model interpretation reveals the key factors of AQI variation: PM2.5 and PM10 contribute positively to the variation of AQI, and AQI is less sensitive to meteorological factors. Finally, Beijing, Shanghai, Xi'an, and Guangzhou were selected to test the model's validity, and the model performance remained good. Our study shows that applying ML to air quality prediction is beneficial for efficiently assessing cities' future air quality.
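The model comparison above rests on a single goodness-of-fit score. Assuming that score is the coefficient of determination (R², defined as one minus the ratio of the residual sum of squares to the total sum of squares), the minimal sketch below shows how it could be computed for any of the three models. The function name `r_squared` and the AQI values are hypothetical and serve only as illustration; they are not data or code from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted AQI values, for illustration only.
observed = [80.0, 120.0, 150.0, 95.0, 60.0]
predicted = [85.0, 110.0, 145.0, 100.0, 65.0]
score = r_squared(observed, predicted)
```

A score of 1 means the predictions match the observations exactly, and a score of 0 means the model does no better than always predicting the mean observed AQI.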