Zrieq Rafat, Kamel Souad, Al-Hamazani Faris, Boubaker Sahbi, Attili Rozan, Araúzo-Bravo Marcos J
Department of Public Health, College of Public Health and Health Informatics, University of Ha'il, Ha'il 55471, Saudi Arabia.
Applied Science Research Center, Applied Science Private University, Amman 11937, Jordan.
Toxics. 2025 Aug 16;13(8):682. doi: 10.3390/toxics13080682.
Air pollution is steadily increasing due to industrialization, economic activities, and transportation. High pollution levels pose a significant threat to human health and well-being worldwide. Saudi Arabia is a growing country with air quality indices ranging from moderate to unhealthy. Although many monitoring stations are distributed throughout the country, mathematical modeling of air pollution remains crucial for health and environmental decision-making. This study therefore applies a data-driven approach, based on pollutant records and a Deep Learning (DL) Long Short-Term Memory (LSTM) algorithm, to the temporal modeling of selected pollutants (PM2.5, PM10, CO, and O3) from time series. This is combined with spatial modeling focused on selected cities (Riyadh, Jeddah, Mecca, Rabigh, Abha, Dammam, and Taif), which together cover about 48% of the country's total population. LSTM provided its best forecasts where the datasets were relatively large. Numerically, the coefficient of determination (R²) ranged from 0.2425 to 0.8073. The best LSTM results were compared with those of two ensemble methods, Random Forest (RF) and eXtreme Gradient Boosting (XGBoost); the comparison confirmed the merits of LSTM, mainly its ability to capture hidden relationships. Overall, meteorological factors showed a weak association with pollutant concentrations, with ambient temperature exerting a moderate influence. However, incorporating ambient temperature into the LSTM models did not significantly improve predictive accuracy. The developed approach can be used to support decision-making in the environmental and health domains, as well as to monitor pollutant concentrations based on historical time series records.
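As a rough illustration of the kind of pipeline the abstract describes, the sketch below shows two generic steps: turning a pollutant time series into fixed-length input windows for one-step-ahead forecasting, and scoring forecasts with the coefficient of determination. It is not the authors' implementation. The function names (`make_windows`, `r_squared`), the 24-step lookback, and the synthetic series are all assumptions made for illustration, and a simple persistence baseline stands in for the LSTM.

```python
import numpy as np


def make_windows(series, lookback):
    """Split a 1-D series into (inputs, targets) for one-step-ahead forecasting.

    Row i of X holds series[i : i + lookback]; y[i] is the value that follows.
    """
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - lookback
    X = np.stack([series[i:i + lookback] for i in range(n_samples)])
    y = series[lookback:]
    return X, y


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic series with a 24-step cycle plus noise.
# Illustrative only: these are NOT data from the study.
rng = np.random.default_rng(0)
t = np.arange(500)
series = 50 + 10 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 2, size=t.size)

# Assumed lookback of 24 steps (hypothetical choice, not from the source).
X, y = make_windows(series, lookback=24)

# Persistence baseline: predict that the next value equals the last observed one.
# A trained sequence model such as an LSTM would replace this line.
y_pred = X[:, -1]

print("windows:", X.shape, "targets:", y.shape)
print("baseline R^2:", r_squared(y, y_pred))
```

A sequence model would consume `X` (reshaped to samples × time steps × features) in place of the persistence line. Scoring a baseline with the same metric gives a reference point for judging R² values such as those reported above.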