Improved intelligent water drop-based hybrid feature selection method for microarray data processing.

Author information

Software Engineering Department, Al-Ahliyya Amman University, Amman, Jordan; King Abdullah II School for Information Technology, The University of Jordan, Amman, Jordan.

King Abdullah II School for Information Technology, The University of Jordan, Amman, Jordan.

Publication information

Comput Biol Chem. 2023 Apr;103:107809. doi: 10.1016/j.compbiolchem.2022.107809. Epub 2023 Jan 13.

Abstract

Classifying microarray datasets, which usually contain many noisy genes that degrade classifier performance and lower classification accuracy, is a competitive research topic. Feature selection (FS) is one of the most practical ways of finding the subset of genes that increases classification accuracy for diagnostic and prognostic prediction of cancer from microarray datasets. More efficient FS methods are therefore always needed: methods that select only an optimal or close-to-optimal subset of features to improve classification performance. In this paper, we propose a hybrid FS method for microarray data processing that combines an ensemble filter with an Improved Intelligent Water Drop (IIWD) algorithm as a wrapper. The IIWD algorithm adds one of three local search (LS) algorithms, Tabu Search (TS), Novel LS Algorithm (NLSA), or Hill Climbing (HC), to each iteration of IWD, and uses a correlation coefficient filter as the heuristic undesirability (HUD) for next-node selection in the original IWD algorithm. The effect of adding the three LS algorithms was evaluated by comparing the proposed ensemble filter and IIWD-based wrapper without any LS algorithm (the FS method named PHFS-IWD) against its performance with TS, NLSA, or HC added (the FS methods named PHFS-IWDTS, PHFS-IWDNLSA, and PHFS-IWDHC, respectively). A Naïve Bayes (NB) classifier and five microarray datasets were used to evaluate and compare the proposed hybrid FS methods. Results show that using an LS algorithm in each iteration of the IWD algorithm improves the F-score by an average of 5% compared with PHFS-IWD. In addition, PHFS-IWDNLSA improves the F-score by an average of 4.15% over PHFS-IWDTS and 5.67% over PHFS-IWDHC, while PHFS-IWDTS outperforms PHFS-IWDHC by an average of 1.6%. Furthermore, compared with six recent state-of-the-art FS methods, the proposed hybrid FS methods improve accuracy by an average of 8.92% on three of the five datasets and reduce the number of genes by 58.5% on all five datasets.
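As a rough illustration of the local-search step described above, the sketch below refines a candidate gene subset with plain bit-flip hill climbing. It is not the authors' implementation: the function `hill_climb`, the neighborhood definition (toggle one feature at a time), and the toy scoring function are all assumptions made for this example. In the paper's setting the score would presumably come from a classifier such as Naïve Bayes evaluated on the selected genes.

```python
from typing import Callable, FrozenSet


def hill_climb(
    start: FrozenSet[int],
    n_features: int,
    score: Callable[[FrozenSet[int]], float],
    max_steps: int = 100,
) -> FrozenSet[int]:
    """Greedy hill climbing over feature subsets (hypothetical helper).

    At each step, every single-feature toggle is tried and the best
    strictly improving neighbor is accepted; the search stops when no
    neighbor improves the score.
    """
    current, current_score = start, score(start)
    for _ in range(max_steps):
        best, best_score = current, current_score
        for f in range(n_features):
            # Neighbor: add feature f if absent, remove it if present.
            neighbor = current ^ frozenset([f])
            if not neighbor:
                continue  # never evaluate the empty subset
            s = score(neighbor)
            if s > best_score:
                best, best_score = neighbor, s
        if best == current:
            break  # local optimum reached
        current, current_score = best, best_score
    return current


def toy_score(subset: FrozenSet[int]) -> float:
    """Toy stand-in for classifier quality (an assumption, not real data):
    features 0 and 2 are 'informative', and each selected feature costs 0.1."""
    informative = {0, 2}
    return len(subset & informative) - 0.1 * len(subset)


# Start from a poor subset, as a wrapper iteration might produce.
refined = hill_climb(frozenset({1, 3}), n_features=5, score=toy_score)
print(sorted(refined))
```

With this toy score the search adds the two informative features and then drops the uninformative ones. Hill climbing only guarantees a local optimum, which is one plausible reason the abstract also considers Tabu Search and a dedicated local search algorithm.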

