Hybrid Incremental Ensemble Learning for Noisy Real-World Data Classification.

Publication information

IEEE Trans Cybern. 2019 Feb;49(2):403-416. doi: 10.1109/TCYB.2017.2774266. Epub 2017 Dec 4.

Abstract

Traditional ensemble learning approaches explore the feature space and the sample space separately, which prevents them from constructing more powerful learning models for noisy real-world dataset classification. The random subspace method searches only over the selection of features, while the bagging approach searches only over the selection of samples. To overcome these limitations, we propose the hybrid incremental ensemble learning (HIEL) approach, which considers the feature space and the sample space simultaneously to handle noisy datasets. Specifically, HIEL first adopts the bagging technique and linear discriminant analysis to remove noisy attributes, and generates a set of bootstraps and the corresponding ensemble members in the subspaces. Then, the classifiers are selected incrementally based on a classifier-specific criterion function and an ensemble criterion function, and the corresponding weights for the classifiers are assigned during the same process. Finally, the final label is obtained by a weighted voting scheme, which serves as the final result of the classification. We also explore various classifier-specific criterion functions based on different newly proposed similarity measures, which alleviate the effect of noisy samples on the distance functions. In addition, the computational cost of HIEL is analyzed theoretically. A set of nonparametric tests is adopted to compare HIEL with other algorithms over several datasets. The experimental results show that HIEL performs well on noisy datasets: it outperforms most of the compared classifier ensemble methods on 14 out of 24 noisy real-world UCI and KEEL datasets.
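The pipeline described in the abstract (bootstraps in feature subspaces, incremental classifier selection with weights, weighted voting) can be illustrated with a minimal sketch. This is not the HIEL algorithm itself: the abstract does not give the criterion functions or similarity measures, so the sketch makes several simplifying assumptions. Random feature subsets stand in for the LDA-based attribute removal, a nearest-centroid learner stands in for the base classifier, and validation accuracy stands in for both the classifier-specific and the ensemble criterion functions. All names (`CentroidClassifier`, `build_ensemble`, and so on) are hypothetical.

```python
import random
from collections import defaultdict


def bootstrap(n, rng):
    """Draw n row indices with replacement (the bagging step)."""
    return [rng.randrange(n) for _ in range(n)]


class CentroidClassifier:
    """Nearest-centroid base learner restricted to a feature subset.
    Assumption: a stand-in for whatever base classifier HIEL uses."""

    def __init__(self, features):
        self.features = features
        self.centroids = {}

    def _project(self, x):
        return [x[j] for j in self.features]

    def fit(self, X, y):
        groups = defaultdict(list)
        for x, label in zip(X, y):
            groups[label].append(self._project(x))
        # One centroid per class: the column-wise mean of its rows.
        self.centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict(self, x):
        p = self._project(x)
        return min(
            self.centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, self.centroids[c])),
        )


def weighted_vote(members, weights, x):
    """Return the label with the largest total weight among member votes."""
    scores = defaultdict(float)
    for clf, w in zip(members, weights):
        scores[clf.predict(x)] += w
    return max(scores, key=scores.get)


def accuracy(members, weights, X, y):
    hits = sum(weighted_vote(members, weights, x) == t for x, t in zip(X, y))
    return hits / len(y)


def build_ensemble(X, y, X_val, y_val, n_candidates=15, subspace_size=2, seed=0):
    rng = random.Random(seed)
    n_features = len(X[0])
    candidates = []
    for _ in range(n_candidates):
        rows = bootstrap(len(X), rng)
        # Assumption: random subspace instead of LDA-based attribute removal.
        feats = rng.sample(range(n_features), subspace_size)
        clf = CentroidClassifier(feats).fit([X[i] for i in rows], [y[i] for i in rows])
        # Assumption: individual validation accuracy doubles as the weight.
        candidates.append((clf, accuracy([clf], [1.0], X_val, y_val)))

    # Greedy incremental selection: start from the best single candidate and
    # add another only if it improves the ensemble's validation accuracy.
    candidates.sort(key=lambda c: c[1], reverse=True)
    members, weights = [candidates[0][0]], [candidates[0][1]]
    best = accuracy(members, weights, X_val, y_val)
    for clf, w in candidates[1:]:
        score = accuracy(members + [clf], weights + [w], X_val, y_val)
        if score > best:
            members.append(clf)
            weights.append(w)
            best = score
    return members, weights
```

Calling `build_ensemble(X, y, X_val, y_val)` on lists of numeric feature vectors returns the selected members and their weights, and `weighted_vote(members, weights, x)` then labels a new sample. Because selection starts from the best single candidate and only accepts additions that raise validation accuracy, the ensemble never scores below its best member on the validation set.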

Similar articles

Hybrid adaptive classifier ensemble.
IEEE Trans Cybern. 2015 Feb;45(2):177-90. doi: 10.1109/TCYB.2014.2322195. Epub 2014 May 20.

Progressive Semisupervised Learning of Multiple Classifiers.
IEEE Trans Cybern. 2018 Feb;48(2):689-702. doi: 10.1109/TCYB.2017.2651114. Epub 2017 Jan 19.

Hybrid k-Nearest Neighbor Classifier.
IEEE Trans Cybern. 2016 Jun;46(6):1263-75. doi: 10.1109/TCYB.2015.2443857. Epub 2015 Jun 26.

Multiobjective Semisupervised Classifier Ensemble.
IEEE Trans Cybern. 2019 Jun;49(6):2280-2293. doi: 10.1109/TCYB.2018.2824299. Epub 2018 Apr 20.

Adaptive Fusion Based Method for Imbalanced Data Classification.
Front Neurorobot. 2022 Feb 28;16:827913. doi: 10.3389/fnbot.2022.827913. eCollection 2022.