Huang Mei-Ling, Hung Yung-Hsiang, Lee W M, Li R K, Jiang Bo-Ru
Department of Industrial Engineering and Management, National Chin-Yi University of Technology, No. 57, Sec. 2, Zhong-Shan Road, Taiping District, Taichung 41170, Taiwan.
Department of Industrial Engineering & Management, National Chiao-Tung University, No. 1001, Ta-Hsueh Road, Hsinchu 300, Taiwan.
ScientificWorldJournal. 2014;2014:795624. doi: 10.1155/2014/795624. Epub 2014 Sep 10.
Support vector machines (SVMs) perform well on classification and prediction tasks and are widely used in disease diagnosis and medical assistance. However, SVM works well only on two-class classification problems. This study combines feature selection with SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems on the Dermatology and Zoo databases. The Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in both datasets were sorted in descending order of explanatory power, and SVM-RFE was used to select different feature subsets whose classification accuracy was then compared. In addition, the Taguchi method was combined with the SVM classifier to optimize the parameters C and γ and so increase multiclass classification accuracy. The experimental results show that classification accuracy can exceed 95% on both the Dermatology and Zoo databases after SVM-RFE feature selection and Taguchi parameter optimization.
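The two steps named in the abstract, recursive feature elimination and Taguchi-style parameter selection, can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `fit_weights` and `evaluate` are hypothetical callbacks standing in for SVM training and accuracy estimation, the sum of squared weights across binary subproblems is one common multiclass ranking criterion that the abstract does not confirm, and the candidate levels for C and γ are invented for the example. The toy stand-ins exist only so the sketch runs without an SVM library.

```python
import math
from itertools import product


def rfe_ranking(n_features, fit_weights):
    """Rank features by recursive elimination, most important first.

    fit_weights(active) is a hypothetical callback: it should train linear
    SVMs on the feature indices in `active` and return one weight vector
    per binary subproblem, each aligned with `active`.
    """
    active = list(range(n_features))
    eliminated = []
    while active:
        weights = fit_weights(active)
        # Assumed multiclass criterion: squared weights summed over subproblems.
        scores = [sum(w[j] ** 2 for w in weights) for j in range(len(active))]
        worst = min(range(len(active)), key=scores.__getitem__)
        eliminated.append(active.pop(worst))
    return eliminated[::-1]


def sn_larger_is_better(values):
    """Taguchi larger-the-better signal-to-noise ratio in decibels."""
    return -10.0 * math.log10(sum(1.0 / (v * v) for v in values) / len(values))


def taguchi_select(levels, evaluate, repeats=1):
    """Pick, for each factor, the level with the highest mean S/N ratio.

    levels maps factor name -> candidate values; evaluate(**params) is a
    hypothetical callback returning a positive score such as accuracy.
    """
    names = list(levels)
    # Every level combination; for two factors this matches the first two
    # columns of a standard orthogonal array (e.g. L9 for three levels).
    runs = list(product(*(levels[name] for name in names)))
    ratios = []
    for run in runs:
        params = dict(zip(names, run))
        scores = [evaluate(**params) for _ in range(repeats)]
        ratios.append(sn_larger_is_better(scores))
    best = {}
    for i, name in enumerate(names):
        mean_sn = {}
        for level in levels[name]:
            picked = [r for run, r in zip(runs, ratios) if run[i] == level]
            mean_sn[level] = sum(picked) / len(picked)
        best[name] = max(mean_sn, key=mean_sn.get)
    return best


# Toy stand-ins (assumptions for illustration only, not real SVM output).
TOY_WEIGHTS = [[0.9, 0.1, 0.5, 0.3], [0.7, 0.2, 0.6, 0.1]]


def toy_fit_weights(active):
    return [[row[i] for i in active] for row in TOY_WEIGHTS]


def toy_accuracy(C, gamma):
    return 0.95 - 0.02 * abs(math.log10(C) - 1) - 0.03 * abs(math.log10(gamma) + 2)


ranking = rfe_ranking(4, toy_fit_weights)
best = taguchi_select({"C": [1, 10, 100], "gamma": [0.001, 0.01, 0.1]}, toy_accuracy)
```

In a real experiment, `fit_weights` would retrain the classifier on the surviving features at every round, and `evaluate` would return a cross-validated accuracy for an SVM trained with the given C and γ on the selected feature subset.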