Department of Nanobiomedical Science and WCU Research Center of Nanobiomedical Science, Dankook University, Cheonan, South Korea.
PLoS One. 2012;7(7):e40419. doi: 10.1371/journal.pone.0040419. Epub 2012 Jul 6.
The goal of feature selection is to select useful features from a given dataset while excluding irrelevant ones, for classification purposes. This is expected to reduce processing time and improve classification accuracy.
In this study, we devised a new feature selection algorithm (CBFS) based on the clearness of features. Feature clearness expresses how separable the classes are within a single feature. Highly clear features contribute to high classification accuracy. CScore is a measure that scores the clearness of each feature and is based on how samples cluster around the class centroids in that feature. We also suggest combining CBFS with other algorithms to improve classification accuracy.
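To make the idea of feature clearness concrete, the following is a minimal sketch and not the published CScore definition. It assumes that a feature's score is the fraction of samples whose nearest class centroid, measured in that single feature, belongs to their own class, and that a centroid is the class mean. The names `clearness_scores` and `select_top_features` are hypothetical and introduced only for illustration.

```python
from statistics import mean


def clearness_scores(X, y):
    """Score each feature by how well class centroids separate the samples.

    Assumption (not taken from the paper): the score of a feature is the
    fraction of samples that lie closest to the centroid of their own class,
    where each centroid is the class mean of that one feature.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        # One centroid per class, computed on this feature alone.
        centroids = {
            c: mean([v for v, label in zip(column, y) if label == c])
            for c in classes
        }
        hits = 0
        for v, label in zip(column, y):
            # Ties go to the first class in sorted order.
            nearest = min(classes, key=lambda c: abs(v - centroids[c]))
            if nearest == label:
                hits += 1
        scores.append(hits / len(y))
    return scores


def select_top_features(X, y, k):
    """Return the indices of the k features with the highest scores."""
    scores = clearness_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data: feature 0 separates the two classes cleanly, feature 1 does not.
X = [[1.0, 5.0], [1.2, 1.0], [0.9, 4.0], [3.0, 1.5], [3.1, 4.5], [2.9, 1.2]]
y = [0, 0, 0, 1, 1, 1]
scores = clearness_scores(X, y)
top = select_top_features(X, y, 1)
```

On this toy data the first feature receives the higher score and is the one selected. A real application would score thousands of features, for example genes in a microarray dataset, and pass the top-ranked ones to a classifier.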
CONCLUSIONS/SIGNIFICANCE: Our experiments confirm that CBFS outperforms recent feature selection algorithms, including FeaLect. CBFS can be applied to microarray gene selection, text categorization, and image classification.