A hybrid feature selection method for DNA microarray data.

Author information

Department of Chemical Engineering, I-Shou University, Kaohsiung 80041, Taiwan.

Publication information

Comput Biol Med. 2011 Apr;41(4):228-37. doi: 10.1016/j.compbiomed.2011.02.004. Epub 2011 Mar 3.

Abstract

Gene expression profiles, which represent the state of a cell at a molecular level, have great potential as a medical diagnosis tool. In cancer classification, available training data sets are generally of a fairly small sample size compared to the number of genes involved. Along with training data limitations, this constitutes a challenge to certain classification methods. Feature (gene) selection can be used to successfully extract those genes that directly influence classification accuracy and to eliminate genes which have no influence on it. This significantly improves calculation performance and classification accuracy. In this paper, correlation-based feature selection (CFS) and the Taguchi-genetic algorithm (TGA) method were combined into a hybrid method, and the K-nearest neighbor (KNN) with the leave-one-out cross-validation (LOOCV) method served as a classifier for eleven classification profiles to calculate the classification accuracy. Experimental results show that the proposed method reduced redundant features effectively and achieved superior classification accuracy. The classification accuracy obtained by the proposed method was higher in ten out of the eleven gene expression data set test problems when compared to other classification methods from the literature.
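The two scoring ideas named in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the abstract does not state the value of K, the distance metric, or the correlation measure, so the sketch assumes K = 1, Euclidean distance, and absolute Pearson correlation as a stand-in. The function names `loocv_knn_accuracy` and `cfs_merit` and the toy data are hypothetical, and the Taguchi-genetic search that would propose candidate gene subsets is not shown.

```python
import numpy as np

def loocv_knn_accuracy(X, y, feature_mask, k=1):
    """Leave-one-out accuracy of k-NN using only the selected features.

    Assumptions (not from the abstract): Euclidean distance, majority vote.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    Xs = X[:, np.asarray(feature_mask, dtype=bool)]
    if Xs.shape[1] == 0:
        return 0.0
    # Squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq = (Xs ** 2).sum(axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (Xs @ Xs.T)
    np.fill_diagonal(dist, np.inf)  # the held-out sample cannot vote for itself
    correct = 0
    for i in range(len(y)):
        neighbors = np.argsort(dist[i])[:k]
        labels, counts = np.unique(y[neighbors], return_counts=True)
        correct += int(labels[np.argmax(counts)] == y[i])
    return correct / len(y)

def cfs_merit(X, y, feature_mask):
    """CFS merit k*r_cf / sqrt(k + k(k-1)*r_ff) for a feature subset.

    Assumption: absolute Pearson correlation is used for both the
    feature-class (r_cf) and feature-feature (r_ff) averages; y is numeric.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    idx = np.flatnonzero(feature_mask)
    k = len(idx)
    if k == 0:
        return 0.0
    Xs = X[:, idx]
    r_cf = np.mean([abs(np.corrcoef(Xs[:, j], y)[0, 1]) for j in range(k)])
    if k == 1:
        return float(r_cf)
    corr = np.abs(np.corrcoef(Xs, rowvar=False))
    r_ff = (corr.sum() - k) / (k * (k - 1))  # mean of off-diagonal entries
    return float(k * r_cf / np.sqrt(k + k * (k - 1) * r_ff))

# Toy data (hypothetical): gene 0 separates the classes, gene 1 is
# uncorrelated with the class, gene 2 is a rescaled copy of gene 0.
y_toy = np.array([0, 0, 0, 1, 1, 1])
g0 = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
g1 = np.array([1.0, -1.0, 0.0, 1.0, -1.0, 0.0])
X_toy = np.column_stack([g0, g1, 2.0 * g0])

print(loocv_knn_accuracy(X_toy, y_toy, [1, 0, 0]))
print(cfs_merit(X_toy, y_toy, [1, 0, 0]), cfs_merit(X_toy, y_toy, [1, 0, 1]))
```

On the toy data, adding the redundant gene 2 to gene 0 leaves the merit unchanged, which is how the CFS criterion discourages redundant features. In a hybrid scheme of the kind the abstract describes, a filter score like this would typically narrow the gene pool before a search procedure evaluates subsets by LOOCV accuracy.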

Similar articles

Improved binary PSO for feature selection using gene expression data.

Comput Biol Chem. 2008 Feb;32(1):29-37. doi: 10.1016/j.compbiolchem.2007.09.005. Epub 2007 Sep 25.

Tabu search and binary particle swarm optimization for feature selection using microarray data.

J Comput Biol. 2009 Dec;16(12):1689-703. doi: 10.1089/cmb.2007.0211.

A novel feature selection approach for biomedical data classification.

J Biomed Inform. 2010 Feb;43(1):15-23. doi: 10.1016/j.jbi.2009.07.008. Epub 2009 Jul 30.

Genetic test bed for feature selection.

Bioinformatics. 2006 Apr 1;22(7):837-42. doi: 10.1093/bioinformatics/btl008. Epub 2006 Jan 20.

Tumor classification ranking from microarray data.

BMC Genomics. 2008 Sep 16;9 Suppl 2(Suppl 2):S21. doi: 10.1186/1471-2164-9-S2-S21.

Optimized between-group classification: a new jackknife-based gene selection procedure for genome-wide expression data.

BMC Bioinformatics. 2005 Sep 28;6:239. doi: 10.1186/1471-2105-6-239.

Classification of intramural metastases and lymph node metastases of esophageal cancer from gene expression based on boosting and projective adaptive resonance theory.

J Biosci Bioeng. 2006 Jul;102(1):46-52. doi: 10.1263/jbb.102.46.

Hybrid genetic algorithm-neural network: feature extraction for unpreprocessed microarray data.

Artif Intell Med. 2011 Sep;53(1):47-56. doi: 10.1016/j.artmed.2011.06.008. Epub 2011 Jul 19.

Correlation-based gene selection and classification using Taguchi-BPSO.

Methods Inf Med. 2010;49(3):254-68. doi: 10.3414/ME09-01-0010. Epub 2010 Feb 5.

Cited by

EGFAFS: A Novel Feature Selection Algorithm Based on Explosion Gravitation Field Algorithm.

Entropy (Basel). 2022 Jun 25;24(7):873. doi: 10.3390/e24070873.

Feature selection revisited in the single-cell era.

Genome Biol. 2021 Dec 1;22(1):321. doi: 10.1186/s13059-021-02544-3.

A framework model using multifilter feature selection to enhance colon cancer classification.

PLoS One. 2021 Apr 16;16(4):e0249094. doi: 10.1371/journal.pone.0249094. eCollection 2021.

An efficient gene selection method for microarray data based on LASSO and BPSO.

BMC Bioinformatics. 2019 Dec 30;20(Suppl 22):715. doi: 10.1186/s12859-019-3228-0.

Improving classification accuracy of cancer types using parallel hybrid feature selection on microarray gene expression data.

Genes Genomics. 2019 Nov;41(11):1301-1313. doi: 10.1007/s13258-019-00859-x. Epub 2019 Aug 19.

Co-ABC: Correlation artificial bee colony algorithm for biomarker gene discovery using gene expression profile.

Saudi J Biol Sci. 2018 Jul;25(5):895-903. doi: 10.1016/j.sjbs.2017.12.012. Epub 2018 Jan 3.

Improving Classification of Cancer and Mining Biomarkers from Gene Expression Profiles Using Hybrid Optimization Algorithms and Fuzzy Support Vector Machine.

J Med Signals Sens. 2018 Jan-Mar;8(1):1-11.

The Correlation-Base-Selection Algorithm for Diagnostic Schizophrenia Based on Blood-Based Gene Expression Signatures.

Biomed Res Int. 2017;2017:7860506. doi: 10.1155/2017/7860506. Epub 2017 Feb 9.

Gene selection for cancer classification with the help of bees.

BMC Med Genomics. 2016 Aug 10;9 Suppl 2(Suppl 2):47. doi: 10.1186/s12920-016-0204-7.

Biomarker Discovery Based on Hybrid Optimization Algorithm and Artificial Neural Networks on Microarray Data for Cancer Classification.

J Med Signals Sens. 2015 Apr-Jun;5(2):88-96.
