A New Representation in PSO for Discretization-Based Feature Selection.

Publication information

IEEE Trans Cybern. 2018 Jun;48(6):1733-1746. doi: 10.1109/TCYB.2017.2714145. Epub 2017 Jun 23.

DOI:10.1109/TCYB.2017.2714145

Abstract

In machine learning, discretization and feature selection (FS) are important techniques for preprocessing data to improve the performance of an algorithm on high-dimensional data. Since many FS methods require discrete data, a common practice is to apply discretization before FS. In addition, for the sake of efficiency, features are usually discretized individually (i.e., univariately). This scheme assumes that each feature influences the task independently, which may not hold when feature interactions exist. Univariate discretization may therefore degrade the performance of the FS stage, since information about feature interactions may be lost during discretization. Initial results of our previously proposed method [evolve particle swarm optimization (EPSO)] showed that combining discretization and FS in a single stage using bare-bones particle swarm optimization (BBPSO) can lead to better performance than applying them in two separate stages. In this paper, we propose a new method called potential particle swarm optimization (PPSO), which employs a new representation that can reduce the search space of the problem and a new fitness function to better evaluate candidate solutions and guide the search. The results on ten high-dimensional datasets show that PPSO selects fewer than 5% of the features on every dataset. Compared with the two-stage approach, which uses BBPSO for FS on the discretized data, PPSO achieves significantly higher accuracy on seven datasets. In addition, PPSO obtains better (or similar) classification performance than EPSO on eight datasets, with a smaller number of selected features on six datasets. Furthermore, PPSO also outperforms the three compared (traditional) methods and performs similarly to one method on most datasets in terms of both generalization ability and learning capacity.
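The abstract's claim that treating features one at a time can miss interaction information can be illustrated with a toy case. The sketch below is not taken from the paper: the data, the helper `mutual_information`, and the variable names are all hypothetical. It uses an XOR-style class label, where each feature alone carries no information about the class but the pair determines it completely.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information I(X; Y) in bits between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


# Hypothetical toy data: the class label is the XOR of two binary features.
f1 = [0, 0, 1, 1]
f2 = [0, 1, 0, 1]
label = [a ^ b for a, b in zip(f1, f2)]

# Scoring each feature on its own (the univariate view).
mi_f1 = mutual_information(f1, label)
mi_f2 = mutual_information(f2, label)

# Scoring the two features jointly (the multivariate view).
mi_joint = mutual_information(list(zip(f1, f2)), label)

print(mi_f1, mi_f2, mi_joint)
```

On this toy data each single-feature score is 0 bits while the joint score is 1 bit, so any criterion that evaluates features strictly one by one would judge both features uninformative.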

Similar articles

A New Representation in PSO for Discretization-Based Feature Selection.

IEEE Trans Cybern. 2018 Jun;48(6):1733-1746. doi: 10.1109/TCYB.2017.2714145. Epub 2017 Jun 23.

Particle swarm optimization for feature selection in classification: a multi-objective approach.

IEEE Trans Cybern. 2013 Dec;43(6):1656-71. doi: 10.1109/TSMCB.2012.2227469.

An improved binary particle swarm optimization algorithm for clinical cancer biomarker identification in microarray data.

Comput Methods Programs Biomed. 2024 Feb;244:107987. doi: 10.1016/j.cmpb.2023.107987. Epub 2023 Dec 21.

Multi-objective Evolutionary Approach for the Performance Improvement of Learners using Ensembling Feature Selection and Discretization Technique on Medical Data.

Curr Med Imaging. 2020;16(4):355-370. doi: 10.2174/1573405614666180903114534.

A Cooperative Coevolutionary Approach to Discretization-Based Feature Selection for High-Dimensional Data.

Entropy (Basel). 2020 Jun 1;22(6):613. doi: 10.3390/e22060613.

A Hybrid Particle Swarm Optimization Algorithm with Dynamic Adjustment of Inertia Weight Based on a New Feature Selection Method to Optimize SVM Parameters.

Entropy (Basel). 2023 Mar 19;25(3):531. doi: 10.3390/e25030531.

Multi-Objective Particle Swarm Optimization Approach for Cost-Based Feature Selection in Classification.

IEEE/ACM Trans Comput Biol Bioinform. 2017 Jan-Feb;14(1):64-75. doi: 10.1109/TCBB.2015.2476796. Epub 2015 Sep 4.

A hybrid feature selection model based on butterfly optimization algorithm: COVID-19 as a case study.

Expert Syst. 2022 Mar;39(3):e12786. doi: 10.1111/exsy.12786. Epub 2021 Jul 29.

Supervised hybrid feature selection based on PSO and rough sets for medical diagnosis.

Comput Methods Programs Biomed. 2014;113(1):175-85. doi: 10.1016/j.cmpb.2013.10.007. Epub 2013 Oct 16.

Multivariate Discretization Based on Evolutionary Cut Points Selection for Classification.

IEEE Trans Cybern. 2016 Mar;46(3):595-608. doi: 10.1109/TCYB.2015.2410143. Epub 2015 Mar 18.

Cited by

Intelligent Joint Space Path Planning: Enhancing Motion Feasibility with Goal-Driven and Potential Field Strategies.

Sensors (Basel). 2025 Jul 12;25(14):4370. doi: 10.3390/s25144370.

A hybrid feature selection algorithm combining information gain and grouping particle swarm optimization for cancer diagnosis.

PLoS One. 2024 Mar 11;19(3):e0290332. doi: 10.1371/journal.pone.0290332. eCollection 2024.

A novel framework of MOPSO-GDM in recognition of Alzheimer's EEG-based functional network.

Front Aging Neurosci. 2023 Jun 29;15:1160534. doi: 10.3389/fnagi.2023.1160534. eCollection 2023.

Feature Selection Based on Adaptive Particle Swarm Optimization with Leadership Learning.

Comput Intell Neurosci. 2022 Aug 28;2022:1825341. doi: 10.1155/2022/1825341. eCollection 2022.

An adaptive and altruistic PSO-based deep feature selection method for Pneumonia detection from Chest X-rays.

Appl Soft Comput. 2022 Oct;128:109464. doi: 10.1016/j.asoc.2022.109464. Epub 2022 Aug 10.

Improved Binary Grasshopper Optimization Algorithm for Feature Selection Problem.

Entropy (Basel). 2022 May 31;24(6):777. doi: 10.3390/e24060777.

Feature Selection in High Dimensional Biomedical Data Based on BF-SFLA.

Front Neurosci. 2022 Apr 18;16:854685. doi: 10.3389/fnins.2022.854685. eCollection 2022.

A graph-based gene selection method for medical diagnosis problems using a many-objective PSO algorithm.

BMC Med Inform Decis Mak. 2021 Nov 27;21(1):333. doi: 10.1186/s12911-021-01696-3.

A Cooperative Coevolutionary Approach to Discretization-Based Feature Selection for High-Dimensional Data.

Entropy (Basel). 2020 Jun 1;22(6):613. doi: 10.3390/e22060613.
