Gao Xintong, Wang Xiaohong, Li Fuping, Jiang Wenhao, Zhe Meng, Sun Jiaxing, Zhang Ao, Jiao Linlin
College of Mining Engineering, North China University of Science and Technology, No. 21 Bohai Avenue, Caofeidian District, Tangshan, 063210, Hebei, China.
Hebei Industrial Technology Institute of Mine Ecological Remediation, Tangshan, 063210, Hebei, China.
Sci Rep. 2025 Jul 1;15(1):20731. doi: 10.1038/s41598-025-07719-7.
High-precision prediction of near-surface particulate matter (PM) concentration is an important prerequisite for effective monitoring and prevention of air pollution, and it also informs the prevention and control of PM-related health risks. Existing PM prediction models rely predominantly on variables influenced by near-surface factors, a limitation that can hinder comprehensive exploration of the continuous spatio-temporal characteristics of PM. In this study, an optimal 7-day prediction model for PM concentration was built with the Stacking algorithm from multi-source data, mainly ground-based atmospheric monitoring station data, daily MODIS-derived aerosol optical depth (AOD) data, and meteorological factors. The integrated RF-LSTM-Stacking model, which combines random forest (RF) and long short-term memory (LSTM) models, gave a superior fit, with R², RMSE, and MAE values of 0.95, 7.74 µg/m³, and 6.08 µg/m³, respectively. This improved prediction accuracy by approximately 17% compared with a single machine learning model. The results show that fusing the RF and LSTM models through the Stacking algorithm substantially improves the accuracy of PM predictions, and that the model can serve as an effective reference for PM monitoring, prediction, and early warning systems.
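For readers less familiar with the three fit statistics quoted in the abstract, the following is a minimal sketch of how R², RMSE, and MAE are conventionally computed from paired observed and predicted concentrations. The function name `regression_metrics` and the 7-day sample values are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return (r2, rmse, mae) for paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]

    # Mean absolute error: average magnitude of the residuals.
    mae = sum(abs(r) for r in residuals) / n

    # Root mean squared error: penalises large residuals more heavily.
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)

    # Coefficient of determination: 1 minus the ratio of residual
    # variation to total variation around the observed mean.
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all observations are equal")
    r2 = 1.0 - ss_res / ss_tot

    return r2, rmse, mae


# Hypothetical 7-day series in µg/m³ (illustrative values only).
observed = [35.0, 42.0, 58.0, 61.0, 47.0, 39.0, 30.0]
predicted = [38.0, 40.0, 55.0, 65.0, 45.0, 41.0, 33.0]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.3f}  RMSE={rmse:.2f}  MAE={mae:.2f}")
```

RMSE and MAE carry the units of the predicted quantity (here µg/m³), whereas R² is dimensionless, which is why the abstract reports units for only two of the three values.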