A New Filter Approach Based on Effective Ranges for Classification of Gene Expression Data.

Authors

Turfan Derya, Altunkaynak Bulent, Yeniay Özgür

Affiliations

Department of Statistics, Hacettepe University, Ankara, Turkey.

Department of Statistics, Gazi University, Ankara, Turkey.

Publication information

Big Data. 2024 Aug;12(4):312-330. doi: 10.1089/big.2022.0086. Epub 2023 Sep 4.

Abstract

Over the years, many studies have been carried out to reduce and eliminate the effects of diseases on human health. Gene expression data sets play a critical role in diagnosing and treating diseases. These data sets consist of thousands of genes but only a small number of samples. This imbalance gives rise to the curse of dimensionality, which makes such data sets difficult to analyze. One of the most effective strategies for addressing this problem is feature selection. Feature selection is a preprocessing step that improves classification accuracy by selecting the most relevant and informative features. In this article, we propose a new statistically based filter method for feature selection, named the Effective Range-based Feature Selection Algorithm (FSAER). As an extension of the earlier Effective Range based Gene Selection (ERGS) and Improved Feature Selection based on Effective Range (IFSER) algorithms, our method combines the advantages of both while also taking the disjoint area into account. To illustrate the efficacy of the proposed algorithm, experiments were conducted on six benchmark gene expression data sets. FSAER was compared with other filter methods in terms of classification accuracy, using support vector machine, naive Bayes, and k-nearest neighbor classifiers.
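
The abstract does not state the FSAER scoring rule, so the following is only a rough sketch of the general effective-range idea behind this family of filters, not the authors' algorithm. It assumes that a class's effective range for a feature is the class mean plus or minus (1 - class prior) × γ × class standard deviation, with γ = 1.732 as an assumed constant, and it scores a feature by how little the class ranges overlap. The function names `effective_ranges`, `overlap_score`, and `rank_features` are hypothetical, and the disjoint-area term mentioned in the abstract is not modeled.

```python
from collections import defaultdict
from statistics import mean, pstdev

GAMMA = 1.732  # assumed range-width constant; not stated in the abstract


def effective_ranges(values, labels, gamma=GAMMA):
    """Return {class_label: (low, high)} for a single feature column."""
    by_class = defaultdict(list)
    for value, label in zip(values, labels):
        by_class[label].append(value)
    total = len(values)
    ranges = {}
    for label, vals in by_class.items():
        prior = len(vals) / total
        # Assumed form: larger classes get a proportionally narrower range.
        half_width = (1.0 - prior) * gamma * pstdev(vals)
        center = mean(vals)
        ranges[label] = (center - half_width, center + half_width)
    return ranges


def overlap_score(values, labels, gamma=GAMMA):
    """Score in [0, 1]; higher means the class ranges overlap less."""
    ranges = list(effective_ranges(values, labels, gamma).values())
    span = max(hi for _, hi in ranges) - min(lo for lo, _ in ranges)
    if span == 0:
        return 0.0  # constant feature: carries no class information
    overlap = 0.0
    for i in range(len(ranges)):
        for j in range(i + 1, len(ranges)):
            lo = max(ranges[i][0], ranges[j][0])
            hi = min(ranges[i][1], ranges[j][1])
            overlap += max(0.0, hi - lo)
    # Pairwise overlaps can exceed the span with many classes, so clamp at 0.
    return max(0.0, 1.0 - overlap / span)


def rank_features(X, y, k):
    """Return the indices of the k highest-scoring feature columns."""
    n_features = len(X[0])
    scores = [overlap_score([row[f] for row in X], y) for f in range(n_features)]
    order = sorted(range(n_features), key=lambda f: scores[f], reverse=True)
    return order[:k]


# Toy data: column 0 separates the two classes, column 1 does not.
X_demo = [[1.0, 5.0], [1.2, 1.0], [0.8, 3.0],
          [5.0, 5.0], [5.2, 1.0], [4.8, 3.0]]
y_demo = [0, 0, 0, 1, 1, 1]
print(rank_features(X_demo, y_demo, k=1))
```

In a pipeline like the one the abstract describes, the selected column indices would be used to subset the expression matrix before training a classifier such as a support vector machine, naive Bayes, or k-nearest neighbor model. Because the score depends only on per-class means and standard deviations, it is a filter: it never consults the classifier itself.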
