Evolutionary binary feature selection using adaptive ebola optimization search algorithm for high-dimensional datasets.

Affiliations

Department of Computer Science, Faculty of Physical Sciences, Ahmadu Bello University, Zaria, Nigeria.

Unit for Data Science and Computing, North-West University, Potchefstroom, South Africa.

Publication information

PLoS One. 2023 Mar 17;18(3):e0282812. doi: 10.1371/journal.pone.0282812. eCollection 2023.

DOI: 10.1371/journal.pone.0282812
PMID: 36930670
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10022820/
Abstract

The feature selection problem is a field of study that requires approximate algorithms to identify discriminative, optimally combined features. The suitability of the selected features is usually evaluated with classifiers. These features are embedded in data that is increasingly generated from sources such as social media, surveillance systems, network applications, and medical records. The high dimensionality of these datasets often impairs the quality of the selected feature combinations. Binary optimization methods have been proposed in the literature to address this challenge. However, the underlying deficiencies of a single binary optimizer carry over to the quality of the selected features. Although hybrid methods have been proposed, most still suffer from the design limitations inherited from the individual methods they combine. To address this, we propose a novel hybrid binary optimization method capable of effectively selecting features from increasingly high-dimensional datasets. The approach uses a sub-population selective mechanism that dynamically assigns individuals to a 2-level optimization process. The level-1 method first mutates items in the population and then reassigns them to a level-2 optimizer. The selective mechanism determines which sub-population is assigned to the level-2 optimizer based on the exploration and exploitation phases of the level-1 optimizer. In addition, we designed nested transfer (NT) functions and investigated their influence on the level-1 optimizer. The binary Ebola optimization search algorithm (BEOSA) is applied for the level-1 mutation, while the simulated annealing (SA) and firefly (FFA) algorithms are investigated for the level-2 optimizer. The outcomes are HBEOSA-SA and HBEOSA-FFA, which are then investigated on the NT, and their corresponding variants HBEOSA-SA-NT and HBEOSA-FFA-NT with no NT applied. The hybrid methods were tested experimentally on high-dimensional datasets to address the challenge of feature selection. A comparative analysis of the methods was carried out to assess how performance varies on low-dimensional datasets. The classification accuracies obtained for large-, medium-, and small-scale datasets are 0.995 using HBEOSA-FFA, 0.967 using HBEOSA-FFA-NT, and 0.953 using HBEOSA-FFA, respectively. The fitness and cost values for large-, medium-, and small-scale datasets are 0.066 and 0.934 using HBEOSA-FFA, 0.068 and 0.932 using HBEOSA-FFA, and 0.222 and 0.970 using HBEOSA-SA-NT, respectively. The findings indicate that HBEOSA-SA, HBEOSA-FFA, HBEOSA-SA-NT, and HBEOSA-FFA-NT outperformed BEOSA.
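The two-level scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and most of its details are assumptions, since the abstract does not give them: the level-1 step is a plain bit-flip mutation standing in for the BEOSA update rule, a standard S-shaped (sigmoid) transfer function stands in for the nested transfer functions, the fitness is the common weighted sum of classification error and subset size, and the selective rule (refine the weaker half while exploring, the stronger half while exploiting) is hypothetical. The names `toy_error`, `RELEVANT`, `level1_mutate`, `level2_anneal`, and `hybrid_select` are illustrative only; in practice `toy_error` would be replaced by the error of a real classifier trained on the selected features.

```python
import math
import random

RELEVANT = {0, 3, 7}  # hypothetical "informative" feature indices for the toy problem


def toy_error(mask):
    """Stand-in for classifier error: fraction of informative features left out."""
    return sum(1 for i in RELEVANT if not mask[i]) / len(RELEVANT)


def sigmoid_transfer(x):
    """S-shaped transfer function: maps a real value to a selection probability."""
    return 1.0 / (1.0 + math.exp(-x))


def binarize(position, rng):
    """Convert a real-valued position into a 0/1 feature mask."""
    return [1 if rng.random() < sigmoid_transfer(x) else 0 for x in position]


def fitness(mask, error_fn, alpha=0.99):
    """Assumed fitness (lower is better): weighted error plus fraction of features kept."""
    if not any(mask):
        return 1.0  # an empty feature subset is treated as the worst case
    return alpha * error_fn(mask) + (1 - alpha) * sum(mask) / len(mask)


def level1_mutate(mask, rate, rng):
    """Level 1 stand-in: independent bit flips (not the BEOSA update rule)."""
    return [1 - b if rng.random() < rate else b for b in mask]


def level2_anneal(mask, error_fn, rng, steps=30, t0=0.1, cooling=0.9):
    """Level 2: simulated annealing over single-bit flips; returns the best mask seen."""
    current, f_cur = mask[:], fitness(mask, error_fn)
    best, f_best = current[:], f_cur
    t = t0
    for _ in range(steps):
        cand = current[:]
        i = rng.randrange(len(cand))
        cand[i] = 1 - cand[i]
        f_cand = fitness(cand, error_fn)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if f_cand <= f_cur or rng.random() < math.exp((f_cur - f_cand) / t):
            current, f_cur = cand, f_cand
            if f_cur < f_best:
                best, f_best = current[:], f_cur
        t *= cooling
    return best


def hybrid_select(n_features, error_fn, pop_size=10, iters=20, seed=0):
    """Two-level loop: mutate everyone, then refine a selected sub-population."""
    rng = random.Random(seed)
    pop = [binarize([rng.uniform(-2, 2) for _ in range(n_features)], rng)
           for _ in range(pop_size)]
    best = min(pop, key=lambda m: fitness(m, error_fn))
    for it in range(iters):
        exploring = it < iters // 2           # first half explores, second half exploits
        rate = 0.2 if exploring else 0.05
        pop = [level1_mutate(m, rate, rng) for m in pop]
        pop.sort(key=lambda m: fitness(m, error_fn))
        half = pop_size // 2
        # Hypothetical selective rule: which half is handed to the level-2 optimizer.
        lo, hi = (half, pop_size) if exploring else (0, half)
        for k in range(lo, hi):
            pop[k] = level2_anneal(pop[k], error_fn, rng)
        cand = min(pop, key=lambda m: fitness(m, error_fn))
        if fitness(cand, error_fn) < fitness(best, error_fn):
            best = cand[:]
    return best


if __name__ == "__main__":
    selected = hybrid_select(12, toy_error)
    print("mask:", selected, "fitness:", fitness(selected, toy_error))
```

Swapping `level2_anneal` for a firefly-style update would give the analogue of the FFA variant; the surrounding loop would stay the same.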

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47aa/10022820/6c29cba9a6d7/pone.0282812.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47aa/10022820/7027d874f2e6/pone.0282812.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47aa/10022820/b593039e5ef1/pone.0282812.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47aa/10022820/6ef02ef8c558/pone.0282812.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47aa/10022820/3eb48b9f0d49/pone.0282812.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47aa/10022820/f1e986f16e9e/pone.0282812.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47aa/10022820/ac1dc3fd7857/pone.0282812.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47aa/10022820/c123a7bdc77d/pone.0282812.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47aa/10022820/f130dee818bf/pone.0282812.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47aa/10022820/bbf0fe587a5e/pone.0282812.g010.jpg

Similar articles

1
A hybrid binary dwarf mongoose optimization algorithm with simulated annealing for feature selection on high dimensional multi-class datasets.
Sci Rep. 2022 Sep 2;12(1):14945. doi: 10.1038/s41598-022-18993-0.
2
Binary Simulated Normal Distribution Optimizer for feature selection: Theory and application in COVID-19 datasets.
Expert Syst Appl. 2022 Aug 15;200:116834. doi: 10.1016/j.eswa.2022.116834. Epub 2022 Mar 15.
3
An improved binary particle swarm optimization algorithm for clinical cancer biomarker identification in microarray data.
Comput Methods Programs Biomed. 2024 Feb;244:107987. doi: 10.1016/j.cmpb.2023.107987. Epub 2023 Dec 21.
4
Hybrid Gradient Descent Grey Wolf Optimizer for Optimal Feature Selection.
Biomed Res Int. 2021 Aug 28;2021:2555622. doi: 10.1155/2021/2555622. eCollection 2021.
5
An Innovative Excited-ACS-IDGWO Algorithm for Optimal Biomedical Data Feature Selection.
Biomed Res Int. 2020 Aug 17;2020:8506365. doi: 10.1155/2020/8506365. eCollection 2020.
6
An Improved Binary Walrus Optimizer with Golden Sine Disturbance and Population Regeneration Mechanism to Solve Feature Selection Problems.
Biomimetics (Basel). 2024 Aug 18;9(8):501. doi: 10.3390/biomimetics9080501.
7
An efficient binary Gradient-based optimizer for feature selection.
Math Biosci Eng. 2021 Apr 30;18(4):3813-3854. doi: 10.3934/mbe.2021192.
8
An improved Differential evolution with Sailfish optimizer (DESFO) for handling feature selection problem.
Sci Rep. 2024 Jun 12;14(1):13517. doi: 10.1038/s41598-024-63328-w.
9
Mutation-based Binary Aquila optimizer for gene selection in cancer classification.
Comput Biol Chem. 2022 Dec;101:107767. doi: 10.1016/j.compbiolchem.2022.107767. Epub 2022 Sep 5.

Cited by

1
Comparative analysis of the gazelle Optimizer and its variants.
Heliyon. 2024 Aug 16;10(17):e36425. doi: 10.1016/j.heliyon.2024.e36425. eCollection 2024 Sep 15.
2
Feature Selection Problem and Metaheuristics: A Systematic Literature Review about Its Formulation, Evaluation and Applications.
Biomimetics (Basel). 2023 Dec 25;9(1):9. doi: 10.3390/biomimetics9010009.
3
A novel feature selection algorithm for identifying hub genes in lung cancer.
Sci Rep. 2023 Dec 7;13(1):21671. doi: 10.1038/s41598-023-48953-1.
4
A Hybrid Model with New Word Weighting for Fast Filtering Spam Short Texts.
Sensors (Basel). 2023 Nov 4;23(21):8975. doi: 10.3390/s23218975.
5
A bio-inspired convolution neural network architecture for automatic breast cancer detection and classification using RNA-Seq gene expression data.
Sci Rep. 2023 Sep 5;13(1):14644. doi: 10.1038/s41598-023-41731-z.