Methodological Integration of Machine Learning and Geospatial Analysis for PM Pollution Mapping.

Authors

Yasin Kalid Hassen, Yasin Muaz Ismael, Iguala Anteneh Derribew, Gelete Tadele Bedo, Kebede Erana

Affiliations

Geo-Information Science Program, School of Geography and Environmental Studies, Haramaya University, P.O. Box 138, 3220 Dire Dawa, Ethiopia.

School of Medicine, College of Health and Medical Sciences, Haramaya University, P.O. Box 235, Harar, Ethiopia.

Publication information

MethodsX. 2025 Apr 17;14:103322. doi: 10.1016/j.mex.2025.103322. eCollection 2025 Jun.

Abstract

Mitigating air pollution requires accurate spatial modelling to inform public health interventions. Traditional approaches inadequately capture complex predictor-pollutant interactions, whereas machine learning (ML) offers a greater capacity for modelling nonlinear relationships. This study compares three ML algorithms, Random Forest (RF), K-Nearest Neighbors (KNN), and Naïve Bayes (NB), using annual PM data from 11 monitoring stations alongside atmospheric, urban, and terrain covariates. The methodological framework employed rigorous preprocessing and cross-validation to classify pollution into three categorical levels. The results demonstrate RF's superior performance: it achieved 94% balanced accuracy and 97% specificity, significantly outperforming KNN (92%) and NB (89%). RF excelled at capturing spatial heterogeneity and complex variable interactions, while KNN and NB showed limitations in handling feature dependencies and localized variability. Despite its computational demands, the findings substantiate RF's reliability for robust air quality monitoring applications. The study offers insights for implementing scalable pollution prediction systems in resource-constrained urban environments, while acknowledging the interpretability challenges inherent to complex ML models.

•Preprocessing of spatial data from various sources, including handling of missing/abnormal data, analysis, and normalization

•Implementation of the three ML algorithms with rigorous hyperparameter tuning, model validation, and performance assessment

•Mapping of PM hotspots by gradient direction and distance from the city center
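The abstract reports balanced accuracy and specificity for a three-class problem without stating how they were computed. As a minimal sketch only, the snippet below shows one common way to derive both from a multiclass confusion matrix: balanced accuracy as the macro-average of per-class recall, and specificity as the macro-average of per-class true-negative rates (one-vs-rest). The class names (`low`, `moderate`, `high`), the toy labels, and all function names are hypothetical and are not taken from the study; some toolkits instead report per-class balanced accuracy as the mean of sensitivity and specificity, so the authors' figures may follow a different definition.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[str], y_pred: Sequence[str],
                     labels: Sequence[str]) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def recall_and_specificity(matrix: List[List[int]]) -> List[Tuple[float, float]]:
    """One-vs-rest recall and specificity for each class."""
    total = sum(sum(row) for row in matrix)
    results = []
    for k in range(len(matrix)):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                  # true class k, predicted otherwise
        fp = sum(row[k] for row in matrix) - tp   # predicted k, true class differs
        tn = total - tp - fn - fp
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        results.append((recall, specificity))
    return results


def balanced_accuracy(matrix: List[List[int]]) -> float:
    """Macro-averaged recall (assumed definition, see lead-in)."""
    per_class = recall_and_specificity(matrix)
    return sum(r for r, _ in per_class) / len(per_class)


def macro_specificity(matrix: List[List[int]]) -> float:
    """Macro-averaged one-vs-rest specificity."""
    per_class = recall_and_specificity(matrix)
    return sum(s for _, s in per_class) / len(per_class)


# Hypothetical toy labels for illustration; not data from the study.
LABELS = ["low", "moderate", "high"]
Y_TRUE = ["low", "low", "low", "moderate", "moderate",
          "moderate", "high", "high", "high", "high"]
Y_PRED = ["low", "low", "moderate", "moderate", "moderate",
          "high", "high", "high", "high", "low"]

toy_matrix = confusion_matrix(Y_TRUE, Y_PRED, LABELS)
print("balanced accuracy:", round(balanced_accuracy(toy_matrix), 4))
print("macro specificity:", round(macro_specificity(toy_matrix), 4))
```

Because balanced accuracy averages over classes rather than over samples, a rarely observed pollution level counts as much as a common one, which matters when only a few monitoring stations fall into the highest category.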

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96ae/12051153/14e55dbd8eb6/ga1.jpg
