Han Fei, Sun Wei, Ling Qing-Hua
School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, China.
School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, China; School of Computer Science and Engineering, Jiangsu University of Science and Technology, Zhenjiang, China.
PLoS One. 2014 May 20;9(5):e97530. doi: 10.1371/journal.pone.0097530. eCollection 2014.
To obtain predictive genes with lower redundancy and better interpretability, this paper proposes a hybrid gene selection method that encodes prior information. First, the prior information, referred to as gene-to-class sensitivity (GCS), is extracted for all genes in the microarray data by a single-hidden-layer feedforward neural network (SLFN). Then, to select more representative and less redundant genes, all genes are grouped into clusters by the K-means method, and some low-sensitivity genes are filtered out according to their GCS values. Finally, a modified binary particle swarm optimization (BPSO) that encodes the GCS information is proposed to perform further gene selection from the remaining genes. Because it takes the GCS information into account, the proposed method selects genes that are highly correlated with the sample classes. Thus, the low-redundancy gene subsets obtained by the proposed method also help to improve classification accuracy on microarray data. Experimental results on several public microarray datasets verify the effectiveness and efficiency of the proposed approach.
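The filtering and BPSO stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the filtering rule, the way GCS enters the BPSO, or the fitness function, so everything below is an assumption made for illustration. The sketch assumes the GCS scores (non-negative) and the K-means cluster labels are already computed, keeps the highest-GCS genes in each cluster, and uses GCS only to bias the initial particle bits. The names `filter_by_gcs`, `bpso_select`, and `toy_fitness` are hypothetical, as are the inertia and acceleration constants.

```python
import math
import random


def filter_by_gcs(gcs, clusters, keep_per_cluster):
    """Keep the highest-GCS genes in each cluster (assumed filtering rule)."""
    by_cluster = {}
    for gene, label in enumerate(clusters):
        by_cluster.setdefault(label, []).append(gene)
    kept = []
    for genes in by_cluster.values():
        genes.sort(key=lambda g: gcs[g], reverse=True)
        kept.extend(genes[:keep_per_cluster])
    return sorted(kept)


def bpso_select(candidates, gcs, fitness, n_particles=10, n_iter=30, seed=0):
    """Binary PSO over `candidates`; GCS biases the initial bits (assumption)."""
    rng = random.Random(seed)
    top = max(gcs[g] for g in candidates) or 1.0
    init_prob = [gcs[g] / top for g in candidates]
    dim = len(candidates)

    def decode(bits):
        # Map a bit vector back to the list of selected gene indices.
        return [g for g, b in zip(candidates, bits) if b]

    swarm = [[1 if rng.random() < p else 0 for p in init_prob]
             for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [bits[:] for bits in swarm]
    pbest_fit = [fitness(decode(bits)) for bits in swarm]
    best = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[best][:], pbest_fit[best]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                v = (0.7 * vel[i][d]
                     + 1.5 * rng.random() * (pbest[i][d] - swarm[i][d])
                     + 1.5 * rng.random() * (gbest[d] - swarm[i][d]))
                vel[i][d] = max(-4.0, min(4.0, v))
                # Sigmoid transfer: velocity -> probability that the bit is 1.
                prob_one = 1.0 / (1.0 + math.exp(-vel[i][d]))
                swarm[i][d] = 1 if rng.random() < prob_one else 0
            fit = fitness(decode(swarm[i]))
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = swarm[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = swarm[i][:], fit
    return decode(gbest), gbest_fit


# Toy data (made up): six genes in three clusters.
gcs = [0.9, 0.1, 0.8, 0.2, 0.7, 0.05]
clusters = [0, 0, 1, 1, 2, 2]


def toy_fitness(subset):
    # Stand-in for classifier accuracy: reward GCS, penalise subset size.
    return sum(gcs[g] for g in subset) - 0.5 * len(subset)


kept = filter_by_gcs(gcs, clusters, keep_per_cluster=1)
subset, fit = bpso_select(kept, gcs, toy_fitness)
```

In a real setting `toy_fitness` would be replaced by a measure such as the cross-validated accuracy of a classifier trained on the selected genes, and `clusters` would come from running K-means on the gene expression profiles.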