Shen Qi, Shi Wei-Min, Kong Wei
Chemistry Department, Zhengzhou University, Zhengzhou, China.
Comput Biol Chem. 2008 Feb;32(1):52-9. doi: 10.1016/j.compbiolchem.2007.10.001. Epub 2007 Oct 22.
Gene expression data are characterized by thousands or even tens of thousands of measured genes but only a few tissue samples. This imbalance can lead to overfitting and the curse of dimensionality, or even to complete failure of the microarray data analysis. Gene selection is therefore an important component of gene expression-based tumor classification systems. In this paper, we develop a hybrid particle swarm optimization (PSO) and tabu search (TS) approach, HPSOTS, for gene selection in tumor classification. Incorporating TS as a local improvement procedure enables HPSOTS to escape local optima and achieve satisfactory performance. The proposed approach is applied to three different microarray data sets, and its performance on these data sets is compared with that of stepwise selection and of the pure TS and PSO algorithms. The results demonstrate that HPSOTS is a useful tool for gene selection and for mining high-dimensional data.
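The abstract does not give the update rules, so the following is only a minimal sketch of the general idea of combining PSO with a tabu search local step, not the authors' implementation. It assumes a standard sigmoid-based binary PSO (which may differ from the rule used in the paper), a single-bit-flip tabu neighbourhood, and a hypothetical `toy_fitness` that stands in for a real criterion such as cross-validated classification accuracy. All function names and parameter values (`hybrid_pso_ts`, `tabu_improve`, `tenure`, `w`, `c1`, `c2`, `v_max`) are illustrative assumptions.

```python
import math
import random


def tabu_improve(mask, fitness, rng, steps=5, n_candidates=8, tenure=3):
    """Short tabu search around `mask` using single-bit flips (assumed neighbourhood)."""
    current = list(mask)
    best, best_fit = list(current), fitness(current)
    tabu = []  # recently flipped gene indices, oldest first
    for _ in range(steps):
        allowed = [i for i in range(len(current)) if i not in tabu]
        candidates = rng.sample(allowed, min(n_candidates, len(allowed)))
        if not candidates:
            break
        # Score each candidate flip and move to the best one, even if it is
        # worse than the current mask; this is what lets the search leave
        # a local optimum.
        scored = []
        for i in candidates:
            neighbour = list(current)
            neighbour[i] = 1 - neighbour[i]
            scored.append((fitness(neighbour), i, neighbour))
        fit, flipped, current = max(scored, key=lambda t: t[0])
        tabu.append(flipped)
        if len(tabu) > tenure:
            tabu.pop(0)  # the oldest move is allowed again
        if fit > best_fit:
            best, best_fit = list(current), fit
    return best, best_fit


def hybrid_pso_ts(fitness, n_genes, n_particles=12, n_iter=30, seed=0,
                  w=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """Binary PSO over gene masks with a tabu search step after each move."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(n_particles)]
    vel = [[0.0] * n_genes for _ in range(n_particles)]
    pbest = [list(p) for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda k: pbest_fit[k])
    gbest, gbest_fit = list(pbest[g]), pbest_fit[g]
    for _ in range(n_iter):
        for k in range(n_particles):
            for j in range(n_genes):
                v = (w * vel[k][j]
                     + c1 * rng.random() * (pbest[k][j] - pos[k][j])
                     + c2 * rng.random() * (gbest[j] - pos[k][j]))
                vel[k][j] = max(-v_max, min(v_max, v))
                # Sigmoid of the velocity gives the probability of selecting gene j.
                prob = 1.0 / (1.0 + math.exp(-vel[k][j]))
                pos[k][j] = 1 if rng.random() < prob else 0
            # Local improvement: refine the new position with tabu search.
            pos[k], fit = tabu_improve(pos[k], fitness, rng)
            if fit > pbest_fit[k]:
                pbest[k], pbest_fit[k] = list(pos[k]), fit
            if fit > gbest_fit:
                gbest, gbest_fit = list(pos[k]), fit
    return gbest, gbest_fit


# Hypothetical toy problem: genes 2, 7 and 11 are "informative"; every
# selected gene costs 0.1, so small subsets are preferred.
INFORMATIVE = {2, 7, 11}


def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


best_mask, best_fit = hybrid_pso_ts(toy_fitness, n_genes=20)
selected = [i for i, bit in enumerate(best_mask) if bit]
print(selected, round(best_fit, 2))
```

In a real setting, `toy_fitness` would be replaced by a function that trains and evaluates a classifier on the genes selected by the mask. That function dominates the running time, so the number of tabu steps and candidate flips per step would need to be chosen with the evaluation cost in mind.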