Shen Qi, Shi Wei-Min, Kong Wei, Ye Bao-Xian
Chemistry Department, Zhengzhou University, Zhengzhou 450052, China; State Key Laboratory of Chemo/Biosensing and Chemometrics, College of Chemistry and Chemical Engineering, Hunan University, Changsha 410082, China.
Talanta. 2007 Mar 15;71(4):1679-83. doi: 10.1016/j.talanta.2006.07.047. Epub 2006 Sep 1.
In the analysis of gene expression profiles, the number of tissue samples with gene expression levels available is usually small compared with the number of genes. This can lead to overfitting or even to a complete failure of the microarray data analysis. Selecting the genes that are truly indicative of the tissue classification concerned has therefore become one of the key steps in microarray studies. In the present paper, we combine a modified discrete particle swarm optimization (PSO) algorithm with support vector machines (SVM) for tumor classification. The modified discrete PSO is applied to select genes, while the SVM is used as the classifier and as the evaluator of each candidate gene subset. The proposed approach was applied to microarray data from 22 normal and 40 tumor colon tissue samples and showed good prediction performance. The results demonstrate that the modified PSO is a useful tool for gene selection and for mining high-dimensional data.
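The wrapper scheme described in the abstract (a discrete PSO proposes gene subsets, a classifier scores them) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the update rule, so the velocity-free rule below (keep the bit, copy the personal best, or copy the global best, chosen by a random draw against a hypothetical threshold `a`) and the per-bit `mutation` rate are assumptions. To keep the sketch dependency-free, a leave-one-out nearest-centroid classifier stands in for the SVM evaluator. All function names, parameters, and the toy data are hypothetical.

```python
import random

def make_toy_data(n_samples=20, n_genes=10, seed=1):
    """Synthetic stand-in for microarray data: only gene 0 carries class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        row[0] = 4.0 * label + rng.gauss(0.0, 0.5)  # informative gene
        X.append(row)
        y.append(label)
    return X, y

def loo_centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on the selected genes.
    Assumption: this replaces the SVM evaluator used in the paper."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for k in range(len(X)):
            if k == i:
                continue  # hold out sample i
            s = sums.setdefault(y[k], [0.0] * len(cols))
            for c, j in enumerate(cols):
                s[c] += X[k][j]
            counts[y[k]] = counts.get(y[k], 0) + 1
        best_label, best_dist = None, float("inf")
        for label, s in sums.items():
            d = sum((X[i][j] - s[c] / counts[label]) ** 2
                    for c, j in enumerate(cols))
            if d < best_dist:
                best_label, best_dist = label, d
        if best_label == y[i]:
            correct += 1
    return correct / len(X)

def discrete_pso_select(X, y, fitness, n_particles=12, n_iter=25,
                        a=0.5, mutation=0.05, seed=0):
    """Velocity-free discrete PSO over boolean gene masks (assumed update rule)."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    swarm = [[rng.random() < 0.5 for _ in range(n_genes)]
             for _ in range(n_particles)]
    pbest = [p[:] for p in swarm]
    pbest_fit = [fitness(X, y, p) for p in swarm]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i, p in enumerate(swarm):
            for j in range(n_genes):
                r = rng.random()
                if r > (1 + a) / 2:
                    p[j] = gbest[j]      # follow the global best
                elif r > a:
                    p[j] = pbest[i][j]   # follow the personal best
                # otherwise keep the current bit
                if rng.random() < mutation:
                    p[j] = not p[j]      # random flip to keep diversity
            f = fitness(X, y, p)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = p[:], f
                if f > gbest_fit:
                    gbest, gbest_fit = p[:], f
    return gbest, gbest_fit

X, y = make_toy_data()
mask, fit = discrete_pso_select(X, y, loo_centroid_accuracy)
selected = [j for j, keep in enumerate(mask) if keep]
print("selected genes:", selected, "LOO accuracy:", fit)
```

In a real study the `fitness` argument would wrap an SVM with proper cross-validation, and the final gene subset would be assessed on samples that were never used during selection, since scoring on the same samples that guided the search overstates performance.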