A new population initialization of metaheuristic algorithms based on hybrid fuzzy rough set for high-dimensional gene data feature selection.

Affiliations

College of Computer Science and Technology, Jilin University, Changchun, 130012, China.

College of Information Technology, Jilin Agricultural University, Changchun, 130118, China.

Publication information

Comput Biol Med. 2023 Nov;166:107538. doi: 10.1016/j.compbiomed.2023.107538. Epub 2023 Oct 4.

Abstract

Modern medicine and biology produce vast amounts of highly complex genetic data. Such high-dimensional data is hard to work with because both its size and its processing cost grow with the number of features, so identifying the critical genes to reduce dimensionality is essential. The filter-wrapper hybrid method is a commonly used approach to feature selection. Most such methods employ filters such as MRMR and ReliefF, but the performance of these simple filters is limited. Rough set methods, by contrast, are a type of filter method that outperforms traditional filters. At the same time, many studies have pointed out that a good initialization strategy is crucial to the performance of metaheuristic algorithms (a type of wrapper-based method). Combining these two points, this paper proposes a novel filter-wrapper hybrid method for high-dimensional feature selection. Specifically, we use a variant of bWOA (binary Whale Optimization Algorithm) based on Hybrid Fuzzy Rough Set to perform attribute reduction, and the reduced attributes serve as prior knowledge for initializing the population. We then employ metaheuristics for further feature selection starting from this initialized population. We conducted experiments using five different algorithms on 14 UCI datasets. The results show that the performance of all five algorithms improved significantly after applying the proposed initialization method. In particular, fuzzy_bMFO, the bMFO variant that uses our initialization method, outperformed six currently advanced algorithms, indicating that our initialization method for metaheuristic algorithms is suitable for high-dimensional feature selection tasks.
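The abstract does not say how the reduced attribute set is turned into an initial population, so the following is only a minimal sketch of one plausible scheme, not the authors' procedure. It assumes the filter stage has already produced a list of selected feature indices (`reduct`), and it builds binary individuals in which those features are switched on with high probability and all other features with low probability. The function name `seeded_population` and the parameters `p_in` and `p_out` are hypothetical and chosen for illustration.

```python
import random


def seeded_population(n_features, reduct, pop_size, p_in=0.9, p_out=0.05, seed=0):
    """Build a binary population biased toward a filter-selected feature subset.

    Assumption (not from the paper): features in `reduct` are included with
    probability `p_in`, all other features with probability `p_out`.
    """
    rng = random.Random(seed)
    reduct = set(reduct)
    # Fallback pool used only to repair an individual that selected nothing.
    candidates = sorted(reduct) or list(range(n_features))
    population = []
    for _ in range(pop_size):
        individual = [
            1 if rng.random() < (p_in if j in reduct else p_out) else 0
            for j in range(n_features)
        ]
        # An empty feature subset cannot be evaluated by a wrapper,
        # so force at least one feature on.
        if not any(individual):
            individual[rng.choice(candidates)] = 1
        population.append(individual)
    return population


# Example: 1000 features, of which a hypothetical filter stage kept the first 20.
population = seeded_population(n_features=1000, reduct=range(20), pop_size=30)
```

Each individual could then be handed to a binary metaheuristic as its starting position. Keeping `p_out` above zero leaves some diversity outside the reduct, so the wrapper stage can still recover features the filter missed.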
