TU Dresden, Dresden Database Research Group, Dresden, Germany.
Environ Monit Assess. 2023 Nov 18;195(12):1491. doi: 10.1007/s10661-023-11996-y.
Air pollution from particulate matter (PM) is one of the greatest threats to human health. Understanding the causes of PM pollution and enacting suitable countermeasures requires reliable predictions of future PM concentrations. The scientific literature offers many machine learning (ML)-based methods for PM prediction, but their quality is difficult to compare because, among other things, they use different data sets and evaluate the resulting predictions differently. For a new data set, it is therefore not apparent which existing prediction method is best suited. To ease the assessment of such models, we present evalPM, a framework to easily create, evaluate, and compare different ML models for immission-based PM prediction. To this end, the framework provides flexibility regarding data sets, input features, target variables, model types, hyperparameters, and model evaluation. It has a modular design consisting of several components, each providing at least one of these flexibilities. The capabilities of the framework are demonstrated using 16 different models from the related literature for temporal prediction of PM concentrations on four European data sets. The demonstration shows that the framework allows fast creation and evaluation of ML-based PM prediction models.
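The core idea of comparing several prediction models on the same data set, with the same temporal split and the same metric, can be illustrated with a minimal sketch. This is not evalPM's API: the `Model` interface, the two toy models, the `compare` helper, and the PM10 values are all hypothetical and chosen only to keep the example self-contained.

```python
from math import sqrt
from typing import Callable, Dict, List, Sequence

# Hypothetical model interface (not evalPM's): a function that takes the
# training series and a horizon and returns that many predictions.
Model = Callable[[Sequence[float], int], List[float]]


def persistence(train: Sequence[float], horizon: int) -> List[float]:
    """Predict that every future value equals the last observed value."""
    return [train[-1]] * horizon


def mean_of_last(window: int) -> Model:
    """Build a model that predicts the mean of the last `window` observations."""
    def predict(train: Sequence[float], horizon: int) -> List[float]:
        recent = train[-window:]
        return [sum(recent) / len(recent)] * horizon
    return predict


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between two equally long sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def compare(series: Sequence[float], models: Dict[str, Model], test_size: int) -> Dict[str, float]:
    """Score every model on one temporal split: the last `test_size` points are held out."""
    train, test = series[:-test_size], series[-test_size:]
    return {name: rmse(test, model(train, len(test))) for name, model in models.items()}


# Made-up daily PM10 concentrations in µg/m³, for illustration only.
pm10 = [20.0, 22.0, 21.0, 25.0, 30.0, 28.0, 26.0, 24.0]
scores = compare(pm10, {"persistence": persistence, "mean_3": mean_of_last(3)}, test_size=2)
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {score:.2f}")
```

Because every model is scored on the same held-out tail of the series with the same metric, the resulting numbers are directly comparable, which is the property the abstract identifies as missing across the literature.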