Ghadi Yazeed Yasin, Saqib Sheikh Muhammad, Mazhar Tehseen, Almogren Ahmad, Waheed Wajahat, Altameem Ayman, Hamam Habib
Department of Computer Science and Software Engineering, Al Ain University, 12555, Abu Dhabi, United Arab Emirates.
Department of Computing and Information Technology, Gomal University, Dera Ismail Khan, 29050, Pakistan.
Sci Rep. 2025 Mar 8;15(1):8070. doi: 10.1038/s41598-025-92788-x.
Smog poses a direct threat to human health and the environment, and addressing it requires understanding how smog forms. Major contributors include industry, fossil fuels, crop burning, and ammonia from fertilizers, but vehicles also play a significant role. A single vehicle's contribution to smog may be small, yet the vast number of vehicles on the road has a substantial collective impact. Manually assessing each vehicle's contribution is impractical, but advances in machine learning make it possible to quantify it. Given a dataset with features such as vehicle model, year, fuel consumption (city), and fuel type, a predictive model can classify vehicles by their smog impact, rating them on a scale from 1 (poor) to 8 (excellent). This study proposes a novel approach that combines Random Forest and Explainable Boosting Classifier models with SMOTE (Synthetic Minority Oversampling Technique) to predict the smog contribution of individual vehicles. The results outperform previous studies, with the proposed model achieving an accuracy of 86%. Key performance metrics include a Mean Squared Error of 0.2269, an R-squared (R²) of 0.9624, a Mean Absolute Error of 0.2104, an Explained Variance Score of 0.9625, and a Max Error of 4.3500. The analysis incorporates explainable AI techniques, using both model-agnostic and model-specific methods, to provide clear and actionable insights. This work represents a significant step forward, and its dataset was last updated only five months ago, underscoring the timeliness and relevance of the research.
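For readers unfamiliar with the five error metrics reported above, the following is a minimal sketch of how each one is conventionally defined, written in plain Python. It is not the authors' evaluation code: the function name `regression_report` and the example ratings are hypothetical, and the abstract does not say how the smog ratings were encoded before the metrics were computed.

```python
from statistics import fmean, pvariance


def regression_report(y_true, y_pred):
    """Compute MSE, MAE, R^2, explained variance, and max error.

    Hypothetical helper for illustration; both inputs are equal-length
    sequences of numeric smog ratings (for example, values from 1 to 8).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Residual = actual rating minus predicted rating.
    residuals = [float(t) - float(p) for t, p in zip(y_true, y_pred)]
    var_true = pvariance([float(t) for t in y_true])
    if var_true == 0:
        raise ValueError("R^2 is undefined when all true ratings are identical")

    mse = fmean([r * r for r in residuals])
    mae = fmean([abs(r) for r in residuals])
    # R^2 = 1 - SS_res / SS_tot, which equals 1 - MSE / Var(y_true).
    r2 = 1.0 - mse / var_true
    # Explained variance ignores any constant bias in the residuals.
    explained_variance = 1.0 - pvariance(residuals) / var_true
    max_error = max(abs(r) for r in residuals)

    return {
        "mse": mse,
        "mae": mae,
        "r2": r2,
        "explained_variance": explained_variance,
        "max_error": max_error,
    }


if __name__ == "__main__":
    # Made-up ratings, not data from the study.
    actual = [2, 4, 6, 8]
    predicted = [2, 5, 6, 7]
    for name, value in regression_report(actual, predicted).items():
        print(f"{name}: {value:.4f}")
```

Under these definitions, R² and the explained variance score coincide when the residuals have zero mean, so the nearly identical values reported above (0.9624 and 0.9625) would be consistent with predictions that carry little systematic bias.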