Department of Nanobiomedical Science, Dankook University, Cheonan 330-714, Republic of Korea.
Comput Biol Chem. 2012 Feb;36:1-12. doi: 10.1016/j.compbiolchem.2011.11.002. Epub 2011 Nov 28.
Classification analysis has been under continuous development since 1936. The field has advanced through new classifiers such as k-nearest neighbors (KNN), artificial neural networks (ANN), and support vector machines (SVM), as well as through progress in data preprocessing. Very high-dimensional data such as microarray data require feature (gene) selection before classification. The goal of feature selection is to choose a subset of informative features that reduces processing time and yields higher classification accuracy. In this study, we devised a method of artificial gene making (AGM) for microarray data to improve classification accuracy. The artificial gene is derived from the whole microarray dataset and is combined with the output of gene selection for classification analysis. Experiments confirmed a clear improvement in classification accuracy after the artificial gene was inserted. The artificial gene worked well with popular feature (gene) selection algorithms and classifiers. The proposed approach can be applied to any type of high-dimensional dataset.
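The pipeline described in the abstract (select genes, append one artificial gene derived from the whole dataset, then classify) can be illustrated with a minimal sketch. The abstract does not say how the artificial gene is constructed, so the construction here is purely an assumption: a signal-to-noise-weighted combination of all genes, used as a stand-in and not as the authors' AGM procedure. The function names, the toy data generator, and all parameter values are likewise hypothetical, and the printed accuracies on synthetic data say nothing about the results reported in the paper.

```python
import math
import random


def signal_to_noise(X, y):
    """Per-gene score (mean1 - mean0) / (std0 + std1), computed on training data."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
        sd_a = math.sqrt(sum((v - mean_a) ** 2 for v in a) / len(a))
        sd_b = math.sqrt(sum((v - mean_b) ** 2 for v in b) / len(b))
        scores.append((mean_b - mean_a) / (sd_a + sd_b + 1e-12))
    return scores


def select_top_genes(scores, k):
    """Gene selection: indices of the k genes with the largest absolute score."""
    return sorted(range(len(scores)), key=lambda j: -abs(scores[j]))[:k]


def artificial_gene(X, scores):
    """ASSUMPTION (not the paper's AGM): one value per sample, built as a
    score-weighted sum over ALL genes. Weights are L2-normalized so the new
    feature is on a scale roughly comparable to a single gene."""
    norm = math.sqrt(sum(s * s for s in scores)) or 1.0
    return [sum(s * v for s, v in zip(scores, row)) / norm for row in X]


def build_features(X, selected, scores, with_artificial):
    """Keep the selected genes and optionally append the artificial gene."""
    rows = [[row[j] for j in selected] for row in X]
    if with_artificial:
        for r, g in zip(rows, artificial_gene(X, scores)):
            r.append(g)
    return rows


def knn_predict(train_X, train_y, x, k=3):
    """Plain k-nearest-neighbors majority vote with Euclidean distance."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def accuracy(train_X, train_y, test_X, test_y, k=3):
    hits = sum(knn_predict(train_X, train_y, x, k) == t
               for x, t in zip(test_X, test_y))
    return hits / len(test_y)


def make_toy_data(n_samples, n_genes, n_informative, rng):
    """Synthetic two-class data: the first n_informative genes are shifted in class 1."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for j in range(n_informative):
            row[j] += 0.5 * label
        X.append(row)
        y.append(label)
    return X, y


rng = random.Random(0)
X, y = make_toy_data(60, 200, 20, rng)
train_X, train_y, test_X, test_y = X[:40], y[:40], X[40:], y[40:]

# Scores and weights come from the training split only, to avoid label leakage.
scores = signal_to_noise(train_X, train_y)
selected = select_top_genes(scores, k=10)

for flag in (False, True):
    tr = build_features(train_X, selected, scores, flag)
    te = build_features(test_X, selected, scores, flag)
    name = "with artificial gene" if flag else "selected genes only"
    print(name, accuracy(tr, train_y, te, test_y))
```

The design point the sketch tries to capture is that gene selection discards most of the dataset, while a single extra feature computed from every gene can carry some of that discarded information back into the classifier. Any other whole-dataset summary could be swapped into `artificial_gene` without changing the rest of the pipeline.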