Department of Automation, Yancheng Institute of Technology, Yancheng 224051, China.
The Wolfson Centre for Bulk Solids Handling Technology, Faculty of Engineering & Science, University of Greenwich, Kent ME4 4TB, UK.
Sci Total Environ. 2019 Feb 15;651(Pt 2):3043-3052. doi: 10.1016/j.scitotenv.2018.10.193. Epub 2018 Oct 15.
Hourly PM concentrations exhibit multiple change patterns. For hourly PM concentration prediction, it is therefore beneficial to split the whole dataset into several subsets with similar properties and to train a local prediction model for each subset. However, methods based on local models must resolve the global-local duality. In this study, a novel prediction model based on classification and regression tree (CART) and ensemble extreme learning machine (EELM) methods is developed; it splits the dataset into subsets hierarchically and builds a prediction model for each leaf. First, CART splits the dataset by constructing a shallow hierarchical regression tree. Then, at each node of the tree, EELM models are built from the node's training samples, and for each leaf of the sub-tree rooted at that node, the number of hidden neurons is selected to minimize the validation error on that leaf. Finally, for each leaf of the tree, the global EELM and the local EELMs on the path from the root to the leaf are compared, and the one with the smallest validation error on the leaf is chosen. Meteorological data for the Yancheng urban area and air pollutant concentration data from the City Monitoring Centre are used to evaluate the method. The experimental results demonstrate that the method addresses the global-local duality and outperforms global models, including random forest (RF), ν-support vector regression (ν-SVR) and EELM, as well as other local models based on season and k-means clustering. The new model improves the capability of handling multiple change patterns.
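The three steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several things in it are assumptions made for illustration: a simple quantile-threshold variance splitter stands in for CART, each ELM uses a tanh hidden layer with least-squares output weights, the hidden-size grid `sizes` is arbitrary, the data are synthetic, and all function names (`fit_eelm`, `best_split`, `fit_node`, `predict`) are hypothetical.

```python
import numpy as np

def fit_elm(X, y, n_hidden, rng):
    # Random hidden layer, output weights by least squares (pseudo-inverse).
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    beta = np.linalg.pinv(np.tanh(X @ W + b)) @ y
    return W, b, beta

def fit_eelm(X, y, n_hidden, n_members=5, seed=0):
    # Ensemble ELM: several ELMs with different random hidden layers.
    rng = np.random.default_rng(seed)
    return [fit_elm(X, y, n_hidden, rng) for _ in range(n_members)]

def predict_eelm(model, X):
    # Average the member predictions.
    return np.mean([np.tanh(X @ W + b) @ beta for W, b, beta in model], axis=0)

def rmse(model, X, y):
    return float(np.sqrt(np.mean((predict_eelm(model, X) - y) ** 2)))

def best_split(X, y, min_leaf):
    # Simplified stand-in for a CART split: try decile thresholds on each
    # feature and keep the one with the smallest within-child squared error.
    best, best_sse = None, np.inf
    for j in range(X.shape[1]):
        for t in np.quantile(X[:, j], np.linspace(0.1, 0.9, 9)):
            left = X[:, j] <= t
            n_left = int(left.sum())
            if min(n_left, len(y) - n_left) < min_leaf:
                continue
            sse = n_left * y[left].var() + (len(y) - n_left) * y[~left].var()
            if sse < best_sse:
                best, best_sse = (j, float(t)), sse
    return best

def fit_node(Xt, yt, Xv, yv, depth, candidates, sizes=(5, 10, 20), min_leaf=20):
    # Train EELMs on this node's samples (one per hidden size) and pass them,
    # together with the ancestors' models, down towards the leaves.
    candidates = candidates + [fit_eelm(Xt, yt, h) for h in sizes]
    split = best_split(Xt, yt, min_leaf) if depth > 0 else None
    if split is not None:
        j, t = split
        lt, lv = Xt[:, j] <= t, Xv[:, j] <= t
        if min(lv.sum(), (~lv).sum()) < min_leaf // 2:
            split = None  # too few validation samples to compare models
    if split is None:
        # Leaf: keep the model on the root-to-leaf path (global or local)
        # with the smallest validation error on this leaf.
        return {"model": min(candidates, key=lambda m: rmse(m, Xv, yv))}
    return {"feature": j, "threshold": t,
            "left": fit_node(Xt[lt], yt[lt], Xv[lv], yv[lv],
                             depth - 1, candidates, sizes, min_leaf),
            "right": fit_node(Xt[~lt], yt[~lt], Xv[~lv], yv[~lv],
                              depth - 1, candidates, sizes, min_leaf)}

def predict(tree, X):
    # Route each sample to its leaf and apply that leaf's chosen EELM.
    if "model" in tree:
        return predict_eelm(tree["model"], X)
    out = np.empty(len(X))
    left = X[:, tree["feature"]] <= tree["threshold"]
    if left.any():
        out[left] = predict(tree["left"], X[left])
    if (~left).any():
        out[~left] = predict(tree["right"], X[~left])
    return out

# Synthetic two-regime data (an assumption, not the Yancheng dataset).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(600, 2))
y = np.where(X[:, 1] > 0, np.sin(3 * X[:, 0]), -2.0 * X[:, 0])
y = y + 0.05 * rng.normal(size=600)
Xt, yt, Xv, yv = X[:400], y[:400], X[400:], y[400:]

tree = fit_node(Xt, yt, Xv, yv, depth=2, candidates=[])
pred = predict(tree, Xv)
```

Because the root-level (global) EELMs remain candidates at every leaf, the selected per-leaf models can never have a larger validation error than any single global candidate. Performance on unseen test data is a separate question that this sketch does not address.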