A Fast Hybrid Feature Selection Based on Correlation-Guided Clustering and Particle Swarm Optimization for High-Dimensional Data.

Authors

Song Xian-Fang, Zhang Yong, Gong Dun-Wei, Gao Xiao-Zhi

Publication information

IEEE Trans Cybern. 2022 Sep;52(9):9573-9586. doi: 10.1109/TCYB.2021.3061152. Epub 2022 Aug 18.

Abstract

The "curse of dimensionality" and high computational cost still limit the application of evolutionary algorithms to high-dimensional feature selection (FS) problems. This article proposes a new three-phase hybrid FS algorithm based on correlation-guided clustering and particle swarm optimization (PSO), called HFS-C-P, to tackle both problems at the same time. To this end, three kinds of FS methods are integrated into the proposed algorithm according to their respective advantages. In the first and second phases, a filter FS method and a feature clustering-based method, both with low computational cost, are designed to reduce the search space used by the third phase. The third phase then searches for an optimal feature subset by using an evolutionary algorithm with global search ability. Moreover, a symmetric uncertainty-based feature deletion method, a fast correlation-guided feature clustering strategy, and an improved integer PSO are developed to improve the performance of the three phases, respectively. Finally, the proposed algorithm is validated on 18 publicly available real-world datasets against nine FS algorithms. Experimental results show that the proposed algorithm can obtain a good feature subset at the lowest computational cost among the compared algorithms.
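The abstract does not give the details of the first-phase filter, so the following is only a minimal sketch of how a symmetric uncertainty (SU) filter could work on discrete features. It uses the standard definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)). The function names, the `threshold` parameter, and the rule "keep a feature if its SU with the class label exceeds the threshold" are assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        # Both variables are constant, so there is no information to share.
        return 0.0
    h_xy = entropy(list(zip(x, y)))      # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy       # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


def filter_features(columns, labels, threshold):
    """Return indices of features whose SU with the labels exceeds threshold.

    The thresholding rule is a hypothetical stand-in for the paper's
    feature deletion method, which the abstract does not specify.
    """
    return [
        i for i, col in enumerate(columns)
        if symmetric_uncertainty(col, labels) > threshold
    ]
```

For example, `filter_features([[0, 0, 1, 1], [0, 1, 0, 1], [5, 5, 5, 5]], [0, 0, 1, 1], 0.1)` would keep only the first feature, since the second is independent of the labels and the third is constant. Continuous features would need to be discretized before this sketch applies.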

