Institute for Atmospheric and Earth System Research / Physics, Faculty of Science, University of Helsinki, Helsinki FI-00560, Finland; Helsinki Institute of Sustainability Science, Faculty of Science, University of Helsinki, Helsinki FI-00560, Finland.
Institute of Environmental Assessment and Water Research (IDAEA-CSIC), Barcelona, Spain; Department of Mining, Industrial and ICT Engineering (EMIT), Manresa School of Engineering (EPSEM), Universitat Politècnica de Catalunya (UPC), Manresa 08242, Spain.
Environ Int. 2024 Feb;184:108449. doi: 10.1016/j.envint.2024.108449. Epub 2024 Jan 22.
Black carbon (BC) has received increasing attention from researchers because of its adverse health effects. However, in-situ BC measurements are often not included as a regulated variable in air quality monitoring networks. Machine learning (ML) models have been studied extensively as virtual sensors that complement reference instruments. This study evaluates and compares three white-box (WB) and four black-box (BB) ML models for estimating BC concentrations, with a focus on their transferability and interpretability. We train the models on long-term air pollutant and weather measurements from an urban background site in Barcelona, and test them at other European urban and traffic sites. Despite the differences in geographical location and site type, BC correlates most strongly with the particle number concentration of the accumulation mode (PN, r = 0.73-0.85) and nitrogen dioxide (NO2, r = 0.68-0.85), and most weakly with meteorological parameters. Owing to this similarity in correlation behaviour, the ML models trained in Barcelona perform well at the traffic site in Helsinki (R = 0.80-0.86; mean absolute error, MAE = 3.90-4.73 %) and at the urban background site in Dresden (R = 0.79-0.84; MAE = 4.23-4.82 %). WB models appear to explain less of the variability in BC than BB models, among which the long short-term memory (LSTM) model outperforms the rest. In terms of interpretability, we adopt several methods, one for each model, to quantify and normalize the relative importance of each input feature. The overall static relative importance commonly used for WB models yields different results from the dynamic values used to show local contributions for BB models. On average, PN and NO2 have the strongest absolute static contributions; however, they affect the estimation both positively and negatively, depending on the site. This comprehensive analysis demonstrates that these interpretable air pollutant ML models can be transferred across space and time.
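The train-at-one-site, test-at-another evaluation described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the helper names (`make_site`, `fit_linear`, `static_importance`) are hypothetical, and ordinary least squares stands in for one simple white-box model. The importance measure (absolute coefficient scaled by feature spread, normalized to sum to one) is one common convention, assumed here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_site(n, slope_pn, slope_no2, noise):
    """Generate SYNTHETIC predictors and BC for one hypothetical site."""
    pn = rng.lognormal(mean=0.0, sigma=0.5, size=n)    # stand-in for PN
    no2 = rng.lognormal(mean=0.0, sigma=0.5, size=n)   # stand-in for NO2
    temp = rng.normal(size=n)                          # weather input, unrelated to BC here
    bc = slope_pn * pn + slope_no2 * no2 + rng.normal(scale=noise, size=n)
    return np.column_stack([pn, no2, temp]), bc

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [intercept, coefficients...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return coef[0] + X @ coef[1:]

def r2_score(y, y_hat):
    """Coefficient of determination."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def mae(y, y_hat):
    """Mean absolute error, in the units of y."""
    return np.mean(np.abs(y - y_hat))

def static_importance(coef, X):
    """Normalized |coefficient| * feature standard deviation (assumed convention)."""
    raw = np.abs(coef[1:]) * X.std(axis=0)
    return raw / raw.sum()

# "Training site" and "test site" differ slightly in how BC depends on the inputs.
X_train, y_train = make_site(2000, slope_pn=1.0, slope_no2=0.8, noise=0.3)
X_test, y_test = make_site(1000, slope_pn=0.9, slope_no2=0.9, noise=0.3)

coef = fit_linear(X_train, y_train)
y_hat = predict(coef, X_test)

r2 = r2_score(y_test, y_hat)
err = mae(y_test, y_hat)
importance = static_importance(coef, X_train)
r_pn = np.corrcoef(X_test[:, 0], y_test)[0, 1]  # Pearson r between PN and BC at the test site

print(f"R2 at test site: {r2:.2f}, MAE: {err:.2f}")
print("Relative importance (PN, NO2, temperature):", np.round(importance, 2))
```

Because the model is fitted only on the training site, the test-site scores reflect transfer rather than in-sample fit. A black-box model would need a local attribution method in place of `static_importance`.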