

Elevating hourly PM2.5 forecasting in Istanbul, Türkiye: Leveraging ERA5 reanalysis and genetic algorithms in a comparative machine learning model analysis.

Affiliations

Department of Computer Technologies, Bergama Vocational School, Dokuz Eylul University, Bergama, Izmir, 35700, Türkiye.

Department of Environmental Engineering, Faculty of Engineering, Dokuz Eylul University, Buca, Izmir, 35390, Türkiye; Dokuz Eylul University, Environmental Research and Application Center (ÇEVMER), Tinaztepe Campus, 35390, Buca, Izmir, Türkiye.

Publication information

Chemosphere. 2024 Sep;364:143096. doi: 10.1016/j.chemosphere.2024.143096. Epub 2024 Aug 13.

Abstract

Rapid urbanization and industrialization have intensified air pollution, posing severe health risks and necessitating accurate PM2.5 predictions for effective urban air quality management. This study distinguishes itself by using high-resolution ERA5 reanalysis data for a grid-based spatial analysis of Istanbul, Türkiye, a densely populated city with diverse pollutant sources. It assesses the predictive accuracy of five machine learning (ML) models: Multiple Linear Regression (MLR), Extreme Gradient Boosting (XGBoost), Light Gradient Boosting (LGB), Random Forest (RF), and Nonlinear Autoregressive with Exogenous Inputs (NARX). Notably, it introduces genetic algorithm optimization for the NARX model to enhance its performance. The models were trained on hourly PM2.5 concentrations from twenty monitoring stations for 2020 and 2021. Istanbul was divided into seven regions based on ERA5 grid distributions to examine the spatial variability of PM2.5. Seventeen input variables from ERA5, including meteorological, land cover, and vegetation parameters, were analyzed using the Neighborhood Component Analysis (NCA) method to identify the most predictive variables. Comparative analysis showed that while all models provided valuable insights (RF > LGB > XGB > MLR), the NARX model outperformed them, particularly with the complex dataset used. The NARX model achieved a high R-value (0.89), low RMSE (5.24 μg/m³), and low MAE (2.94 μg/m³). It performed best in autumn and winter, with the highest accuracy in Region-1 (R-value 0.94) and the lowest in Region-5 (R-value 0.75). This study's success in a complex urban setting with limited monitoring underscores the robustness of the NARX model and the methodology's potential for global application in similar urban contexts. By addressing temporal and spatial variability in air quality predictions, this research sets a new benchmark and highlights the importance of advanced data analysis techniques for developing targeted pollution control strategies and public health policies.
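
For readers unfamiliar with the three scores quoted in the abstract, the following is a minimal sketch of how R, RMSE, and MAE are commonly computed from paired observed and predicted concentrations. It assumes that the R-value is the Pearson correlation coefficient, which the abstract does not state explicitly. The function name `evaluate_forecast` and the sample values are hypothetical and are not taken from the study.

```python
import math


def evaluate_forecast(observed, predicted):
    """Return R (Pearson correlation, an assumption), RMSE, and MAE.

    Both inputs are equal-length sequences of concentrations in the
    same unit (for example μg/m³).
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    # Pointwise forecast errors (predicted minus observed).
    errors = [p - o for o, p in zip(observed, predicted)]

    # Root mean squared error and mean absolute error.
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n

    # Pearson correlation between observations and predictions.
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant series")
    r = cov / math.sqrt(var_o * var_p)

    return {"r": r, "rmse": rmse, "mae": mae}


# Hypothetical hourly values, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
scores = evaluate_forecast(observed, predicted)
print(scores)
```

RMSE penalizes large misses more heavily than MAE, so RMSE is never smaller than MAE for the same forecast. The gap between the two reported values (5.24 versus 2.94 μg/m³) is consistent with a relatively small number of larger errors accounting for much of the squared error.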

