Mao Yong, Zhou Xiao-Bo, Pi Dao-Ying, Sun You-Xian, Wong Stephen T C
National Laboratory of Industrial Control Technology, Institute of Modern Control Engineering, Zhejiang University, Hangzhou 310027, China.
J Zhejiang Univ Sci B. 2005 Oct;6(10):961-73. doi: 10.1631/jzus.2005.B0961.
In microarray-based cancer classification, gene selection is an important issue because the data combine a large number of variables with a small number of samples and are non-linear. Conventional linear statistical methods rarely give satisfactory results under these conditions. Recursive feature elimination based on support vector machines (SVM RFE) is an effective algorithm that integrates gene selection and cancer classification into a single consistent framework. In this paper, we propose a new method for selecting the parameters of this algorithm when it is implemented with Gaussian kernel SVMs: a genetic algorithm searches for a pair of optimal parameters, as a better alternative to the common practice of selecting the apparently best parameters. Fast implementation issues for this method are also discussed for pragmatic reasons. The proposed method was tested on two representative datasets, one for hereditary breast cancer and one for acute leukaemia. The experimental results indicate that the proposed method performs well in selecting genes and achieves high classification accuracy with the selected genes.
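The parameter search described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the genetic algorithm's encoding, operators, or fitness function, so everything below is an assumption: the pair of parameters is taken to be the SVM penalty C and the Gaussian kernel width gamma (searched in log10 space), the function name `genetic_search` and its arguments are hypothetical, and `toy_fitness` is a smooth stand-in with a known peak rather than the accuracy of an actual SVM RFE run.

```python
import math
import random


def genetic_search(fitness, bounds, pop_size=30, generations=40,
                   mutation_scale=0.3, elite=2, seed=0):
    """Maximise fitness(log10_C, log10_gamma) inside the box `bounds`.

    Hypothetical helper: a plain real-coded genetic algorithm with
    elitism, tournament selection, blend crossover and Gaussian mutation.
    """
    rng = random.Random(seed)

    def clip(value, low, high):
        return max(low, min(high, value))

    def tournament(scored):
        # Return the fitter of two randomly chosen individuals.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    population = [tuple(rng.uniform(low, high) for low, high in bounds)
                  for _ in range(pop_size)]

    for _ in range(generations):
        scored = sorted(((fitness(*ind), ind) for ind in population),
                        reverse=True)
        # Elitism: carry the best individuals over unchanged.
        next_population = [ind for _, ind in scored[:elite]]
        while len(next_population) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            w = rng.random()  # blend weight for crossover
            child = tuple(
                clip(w * g1 + (1 - w) * g2 + rng.gauss(0, mutation_scale),
                     low, high)
                for g1, g2, (low, high) in zip(p1, p2, bounds)
            )
            next_population.append(child)
        population = next_population

    best_score, best = max((fitness(*ind), ind) for ind in population)
    return best, best_score


def toy_fitness(log_c, log_gamma):
    # Stand-in for a real criterion (assumed, not from the paper):
    # a smooth surface peaking at log10(C) = 1, log10(gamma) = -2.
    return math.exp(-((log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2))


# Search log10(C) in [-2, 4] and log10(gamma) in [-5, 1].
best, best_score = genetic_search(toy_fitness, [(-2.0, 4.0), (-5.0, 1.0)])
print("log10(C), log10(gamma):", best, "fitness:", best_score)
```

In a real setting the fitness function would presumably train and evaluate a Gaussian kernel SVM RFE pipeline for each candidate pair, for example by cross-validated accuracy on the selected genes. That evaluation is expensive, which may be one reason the abstract mentions fast implementation.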