Spatial hazard assessment of the PM10 using machine learning models in Barcelona, Spain.

Author information

Soil Conservation and Watershed Management Research Department, West Azarbaijan Agricultural and Natural Resources Research and Education Center, AREEO, Urmia, Iran.

Department of Reclamation of Arid and Mountainous Regions, Faculty of Natural Resources, University of Tehran, Karaj, Iran.

Publication information

Sci Total Environ. 2020 Jan 20;701:134474. doi: 10.1016/j.scitotenv.2019.134474. Epub 2019 Oct 4.

DOI:10.1016/j.scitotenv.2019.134474

PMID:31704408

Abstract

Air pollution, and especially atmospheric particulate matter (PM), has a profound impact on human mortality and morbidity, the environment, and ecological systems. Predicting air quality is therefore highly relevant. Although the application of machine learning (ML) models to predicting air quality parameters, such as PM concentrations, has been evaluated in previous studies, studies on spatial hazard modeling of these parameters are very limited. Given the high potential of ML models, spatial modeling of PM can help managers identify pollution hotspots. Accordingly, this study aims to develop new ML models, namely Random Forest (RF), Bagged Classification and Regression Trees (Bagged CART), and Mixture Discriminant Analysis (MDA), for the hazard prediction of PM10 (particles with a diameter of less than 10 µm) in the Barcelona Province, Spain. Based on the annual PM10 concentration at 75 stations, healthy and unhealthy locations are determined, and a 70/30 ratio (53/22 stations) is applied for calibrating and validating the ML models to predict the most hazardous areas for PM10. To identify the influential variables for PM modeling, the simulated annealing (SA) feature selection method is used. Seven of the thirteen features are selected as critical features. According to the results, all three ML models achieve excellent performance (accuracy > 87% and precision > 86%). However, the Bagged CART and RF models perform identically, and both outperform the MDA model. The spatial hazard maps predicted by the three models indicate that the highly hazardous areas are located in the middle of the Barcelona Province more than in the Barcelona Metropolitan Area.
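The calibration/validation split and the two reported metrics can be illustrated with a minimal sketch. This is not the authors' code: the function names, the random shuffle, the seed, and the choice to round the calibration share up (which is one way to arrive at 53/22 from 75 stations) are all assumptions made for illustration, and no station data from the study is used.

```python
import math
import random


def split_stations(station_ids, train_percent=70, seed=0):
    """Shuffle station IDs and split them into calibration and validation sets.

    Assumption: the calibration share is rounded up, so 75 stations
    give 53 for calibration and 22 for validation.
    """
    ids = list(station_ids)
    random.Random(seed).shuffle(ids)  # hypothetical seed, for reproducibility only
    n_train = math.ceil(len(ids) * train_percent / 100)
    return ids[:n_train], ids[n_train:]


def accuracy_and_precision(y_true, y_pred, positive=1):
    """Return (accuracy, precision) for binary labels.

    `positive` marks the class treated as "unhealthy" (an assumed encoding).
    """
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    true_pos = sum(t == positive and p == positive for t, p in pairs)
    false_pos = sum(t != positive and p == positive for t, p in pairs)
    accuracy = correct / len(pairs)
    predicted_pos = true_pos + false_pos
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    return accuracy, precision


# Placeholder station indices 0..74 stand in for the 75 monitoring stations.
calibration, validation = split_stations(range(75))
print(len(calibration), len(validation))
```

In practice the predicted labels would come from a fitted classifier such as RF or Bagged CART evaluated on the validation stations; the metric function above only shows how accuracy and precision are derived from those labels.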
