Laboratory of Water, Energy and Environment (Lab 3E), Sfax National School of Engineering, University of Sfax, Sfax, Tunisia.
LIRIS, UMR 5205, University of Lyon 1, Villeurbanne, France.
Sci Total Environ. 2020 May 1;715:136991. doi: 10.1016/j.scitotenv.2020.136991. Epub 2020 Jan 28.
Air pollution is considered one of the biggest threats to ecosystems and human existence, so air quality monitoring has become a necessity in urban and industrial areas. The recent emergence of Machine Learning techniques justifies applying statistical approaches to environmental modeling, especially air quality forecasting. In this context, we propose a novel feature ranking method, termed Ensemble of Regressor Chains-guided Feature Ranking (ERCFR), to forecast multiple air pollutants simultaneously in two cities. The approach combines one of the most powerful ensemble methods for Multi-Target Regression problems (Ensemble of Regressor Chains) with the Random Forest permutation importance measure. The resulting feature selection allows the model to obtain its best results with a restricted subset of features. The experimental results reveal the superiority of the proposed approach over other state-of-the-art methods, although care is needed to improve its runtime performance and to reduce its sensitivity to extreme and outlier values.
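The combination described above can be illustrated with a minimal sketch. It is not the authors' ERCFR implementation: it assumes scikit-learn's `RegressorChain` and `RandomForestRegressor`, uses synthetic data in place of pollutant measurements, and computes permutation importance by hand on a held-out split rather than through the Random Forest's own measure. The helper names (`fit_erc`, `predict_erc`, `permutation_ranking`) and all hyperparameters are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.multioutput import RegressorChain


def fit_erc(X, Y, n_chains=3, seed=0):
    """Fit several regressor chains, each with its own random target order."""
    chains = []
    for i in range(n_chains):
        base = RandomForestRegressor(n_estimators=30, random_state=seed + i)
        chain = RegressorChain(base, order="random", random_state=seed + i)
        chains.append(chain.fit(X, Y))
    return chains


def predict_erc(chains, X):
    """Average the multi-target predictions of all chains."""
    return np.mean([chain.predict(X) for chain in chains], axis=0)


def permutation_ranking(chains, X, Y, n_repeats=3, seed=0):
    """Score each feature by the mean drop in R^2 when its column is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = r2_score(Y, predict_erc(chains, X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - r2_score(Y, predict_erc(chains, X_perm)))
        importances[j] = np.mean(drops)
    # Feature indices sorted from most to least important.
    return importances, np.argsort(importances)[::-1]


# Synthetic stand-in data (an assumption): 5 features, 2 correlated targets.
# Only features 0 and 1 carry signal; features 2-4 are pure noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(400, 5))
y0 = 3.0 * X[:, 0] + 0.1 * rng.normal(size=400)
y1 = 2.0 * X[:, 1] + 0.5 * y0 + 0.1 * rng.normal(size=400)
Y = np.column_stack([y0, y1])

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)
chains = fit_erc(X_train, Y_train)
importances, ranking = permutation_ranking(chains, X_test, Y_test)

# Keep a restricted subset (here the top 2 features) and refit on it.
top_k = ranking[:2]
reduced_chains = fit_erc(X_train[:, top_k], Y_train)
reduced_r2 = r2_score(Y_test, predict_erc(reduced_chains, X_test[:, top_k]))
print("ranking:", ranking.tolist(), "R^2 on reduced subset:", round(reduced_r2, 3))
```

Averaging chains with different random target orders reduces the dependence on any single ordering, and ranking features against the ensemble's averaged prediction ties the importance scores to all targets at once rather than to one pollutant.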