Wang Shuaiqun, Kong Wei, Deng Jin, Gao Shangce, Zeng Weiming
College of Information Engineering, Shanghai Maritime University, Shanghai, China.
Faculty of Engineering, University of Toyama, Toyama, Japan.
Comb Chem High Throughput Screen. 2018;21(6):420-430. doi: 10.2174/1386207321666180601074349.
Redundant information in microarray gene expression data makes cancer classification difficult. It is therefore important to find appropriate ways of selecting informative genes so that cancer can be identified more reliably. This study presents a hybrid feature selection method, mRMR-ICA, which combines minimum redundancy maximum relevance (mRMR) with the imperialist competition algorithm (ICA) for cancer classification.
The presented algorithm, mRMR-ICA, uses mRMR as a preprocessing step to delete redundant genes and to provide a reduced dataset on which ICA performs feature selection. A support vector machine (SVM) evaluates the classification accuracy of the selected feature genes. The fitness function includes both the classification accuracy and the number of selected genes.
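The abstract states only that the fitness function includes classification accuracy and the number of selected genes; it does not give the formula. As an illustration, the following is a minimal sketch of one common way to combine the two, a weighted sum. The names `fitness` and `mask_fitness`, the `weight` parameter, and its default of 0.9 are assumptions made for this example and are not taken from the paper.

```python
from typing import Sequence


def fitness(accuracy: float, n_selected: int, n_total: int,
            weight: float = 0.9) -> float:
    """Hypothetical weighted-sum fitness; higher is better.

    ASSUMPTION: the weighted-sum form and the default weight are
    illustrative only. The paper's actual formula is not given here.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    if not 0 < n_selected <= n_total:
        raise ValueError("need 0 < n_selected <= n_total")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    # Reward small subsets: 0 when every gene is kept, near 1 when few are.
    size_reward = 1.0 - n_selected / n_total
    return weight * accuracy + (1.0 - weight) * size_reward


def mask_fitness(mask: Sequence[int], accuracy: float,
                 weight: float = 0.9) -> float:
    """Score a binary gene mask (1 = gene kept) given its classifier accuracy."""
    return fitness(accuracy, sum(mask), len(mask), weight)


if __name__ == "__main__":
    # Two candidate subsets with equal accuracy: the smaller one scores higher.
    print(fitness(0.95, 10, 100) > fitness(0.95, 50, 100))
```

In a full pipeline the `accuracy` argument would come from an SVM evaluated on the genes that survive mRMR preprocessing. Under this assumed form, a candidate solution with fewer genes is preferred whenever accuracy is tied.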
Ten benchmark microarray gene expression datasets were used to test the performance of mRMR-ICA. Compared with the original ICA and other evolutionary algorithms, mRMR-ICA improved on both the accuracy of cancer classification and the number of informative genes selected.
The comparison results demonstrate that mRMR-ICA effectively deletes redundant genes, so that the algorithm selects fewer informative genes while achieving better classification results. It can also shorten computation time and improve efficiency.