Chiang J H, Ho S H
Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan, Republic of China.
IEEE Trans Nanobioscience. 2008 Mar;7(1):91-9. doi: 10.1109/TNB.2008.2000142.
This paper presents a novel rough-based feature selection method for gene expression data analysis. The method finds relevant features without requiring the number of clusters to be known a priori, and it identifies centers that approximate the correct ones. We introduce a prediction scheme that combines the rough-based feature selection method with a radial basis function neural network. To further examine the effect of different feature selection methods and classifiers on this prediction process, we also use naive Bayes and a linear support vector machine as classifiers, and we compare performance against other feature selection methods, including information gain and principal component analysis. We evaluate the approach on several published datasets, and the results show that the proposed method achieves high classification accuracy.
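The abstract describes a two-stage scheme: select a small set of informative genes, then train a classifier on the reduced data. The following is a minimal sketch of that general pattern using one of the baseline combinations the abstract names (information gain for selection, naive Bayes for classification). It is not the authors' rough-based method or their radial basis function network, whose details the abstract does not give. The median-based binarization, the Gaussian likelihood, the function and class names, and the toy expression values are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one feature, binarized at its upper median (assumed discretization)."""
    ordered = sorted(values)
    threshold = ordered[len(ordered) // 2]
    low = [y for v, y in zip(values, labels) if v < threshold]
    high = [y for v, y in zip(values, labels) if v >= threshold]
    gain = entropy(labels)
    for part in (low, high):
        if part:
            gain -= len(part) / len(labels) * entropy(part)
    return gain


def select_features(X, y, k):
    """Return the indices of the k features with the highest information gain."""
    n_features = len(X[0])
    gains = [information_gain([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: gains[j], reverse=True)[:k]


def project(X, indices):
    """Keep only the selected feature columns."""
    return [[row[j] for j in indices] for row in X]


class GaussianNaiveBayes:
    """Naive Bayes with one Gaussian per feature and class (assumed likelihood model)."""

    def fit(self, X, y):
        self.stats = {}
        for label in set(y):
            rows = [row for row, t in zip(X, y) if t == label]
            prior = len(rows) / len(X)
            params = []
            for col in zip(*rows):
                mean = sum(col) / len(col)
                # Small constant keeps the variance strictly positive.
                var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
                params.append((mean, var))
            self.stats[label] = (prior, params)
        return self

    def _log_posterior(self, label, row):
        prior, params = self.stats[label]
        score = math.log(prior)
        for v, (mean, var) in zip(row, params):
            score += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
        return score

    def predict(self, X):
        return [max(self.stats, key=lambda c: self._log_posterior(c, row)) for row in X]


# Toy data (made up): 6 samples x 4 "genes"; genes 0 and 2 separate the classes.
X_train = [
    [5.1, 2.0, 0.9, 3.3],
    [4.8, 3.1, 1.1, 2.9],
    [5.3, 2.4, 1.0, 3.1],
    [1.2, 2.2, 4.0, 3.0],
    [0.9, 3.0, 4.2, 3.2],
    [1.1, 2.5, 3.9, 2.8],
]
y_train = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
X_new = [[5.0, 2.7, 1.2, 3.0], [1.0, 2.1, 4.1, 3.1]]

selected = select_features(X_train, y_train, k=2)
model = GaussianNaiveBayes().fit(project(X_train, selected), y_train)
predictions = model.predict(project(X_new, selected))
print(selected, predictions)
```

Keeping selection and classification as separate steps makes it straightforward to swap in a different selector or classifier, which is the kind of comparison the abstract reports.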