Zhalehdoost Alireza, Taleai Mohammad
GIS Department, Faculty of Geodesy & Geomatics Engineering, K. N. Toosi University of Technology, P.O. Box 16315-1355, Tehran, Iran.
Geospatial Research Innovation and Development (GRID), School of Built Environment (BE), University of New South Wales (UNSW), Sydney, Australia.
Sci Rep. 2025 Jul 29;15(1):27708. doi: 10.1038/s41598-025-13639-3.
Urban air pollution poses a major threat to public health and environmental sustainability. This study proposes a structured machine learning (ML) framework to examine how the choice of temporal and spatial resolution affects the accuracy of urban air pollution modeling. The research proceeds in two phases. In the temporal phase, the effect of incorporating pollutant autocorrelation into two ML models, Multilayer Perceptron (MLP) and Random Forest (RF), is analyzed by comparing results with and without temporal lag features derived through autoregressive (AR) modeling. In the spatial phase, emission inventory data are aggregated at three spatial resolutions (500 m, 750 m, and 1000 m) to evaluate the effect of resolution on model performance in predicting PM and NOx concentrations. Results from the temporal phase indicate that including lag features significantly improves PM predictions: RMSE for PM₁₀ falls by 25.9% (from 92.56 µg/m³ to 68.59 µg/m³), and RMSE for PM₂.₅ falls by 38.9% (from 61.10 µg/m³ to 37.30 µg/m³). For NOx, by contrast, RMSE rises by 53.2% (from 7.90 µg/m³ to 12.10 µg/m³), indicating that temporal behavior is pollutant-specific. In the spatial phase, the coarsest resolution (1000 m) yields the best performance for PM (RMSE = 13.51 kg/year), while the finest resolution (500 m) is most effective for NOx (RMSE = 307.50 kg/year). Of the two algorithms evaluated, MLP consistently achieves the higher predictive accuracy in both the temporal and the spatial scenarios. These findings underscore the importance of selecting temporal and spatial resolutions tailored to each pollutant type. The proposed framework offers a flexible, resolution-aware modeling strategy that can support more effective urban air quality management policies.
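As an illustration of the temporal phase, the following is a minimal sketch of how lag features can be built from a pollutant time series and how the relative RMSE changes quoted in the abstract can be recomputed. It is not the authors' implementation: the function names (`make_lag_features`, `rmse`, `percent_change`), the choice of two lags, and the toy series are assumptions made for illustration only. The before/after RMSE values are the ones reported in the abstract.

```python
import math


def make_lag_features(series, n_lags):
    """Build a lagged design matrix: each row of X holds the n_lags
    observations that immediately precede the target value in y."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])
        y.append(series[t])
    return X, y


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)


def percent_change(before, after):
    """Signed relative change in percent (negative means RMSE went down)."""
    return 100.0 * (after - before) / before


# Toy concentration series (made-up values, illustration only).
series = [10.0, 12.0, 11.0, 13.0, 14.0]
X, y = make_lag_features(series, n_lags=2)  # n_lags=2 is an assumption

# RMSE without and with lag features, as reported in the abstract.
# "PM" is the second particulate value listed after PM10.
reported = {
    "PM10": (92.56, 68.59),
    "PM": (61.10, 37.30),
    "NOx": (7.90, 12.10),
}
changes = {name: percent_change(b, a) for name, (b, a) in reported.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.2f}% change in RMSE")
```

The rows of `X` could be passed, alongside any other predictors, to a regressor such as an RF or MLP model. The recomputed changes agree with the reported 25.9%, 38.9%, and 53.2% to within about 0.1 percentage point; small differences are expected because the published RMSE values are themselves rounded.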