IEEE Trans Cybern. 2019 Aug;49(8):2860-2873. doi: 10.1109/TCYB.2018.2829811. Epub 2018 May 10.
Selecting relevant genes is crucial for cancer classification when analyzing cancer gene expression datasets that contain two types of tumors. Most existing gene selection methods cannot fully identify the intrinsic interactions among the selected genes. In this paper, we propose a weighted general group lasso (WGGL) model that selects cancer genes in groups. A heuristic gene grouping method is presented, based on weighted gene co-expression network analysis. To determine the importance of genes and groups, a method for calculating gene and group weights from joint mutual information is presented. A gene selection algorithm is developed to carry out the complex calculation process of WGGL. Experimental results on both random data and three cancer gene expression datasets demonstrate that the proposed model achieves better classification performance than two existing state-of-the-art gene selection methods.
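To illustrate what selecting genes "in groups" means for a weighted group lasso penalty, the following is a minimal sketch of the standard group soft-thresholding (proximal) step for a penalty of the form lam * sum_g w_g * ||beta_g||_2. It is not the WGGL algorithm from the paper: the function name `group_soft_threshold`, the parameters `lam` and `step`, and the example weights are assumptions made for illustration. In particular, the weights here are fixed placeholders rather than values computed from joint mutual information, and the groups are given by hand rather than derived from a co-expression network.

```python
import math


def group_soft_threshold(beta, groups, group_weights, lam, step=1.0):
    """Apply weighted group soft-thresholding to a coefficient vector.

    beta          -- list of coefficients, one per gene (hypothetical input)
    groups        -- list of index lists, one per gene group
    group_weights -- one non-negative weight per group (placeholder values,
                     not the joint-mutual-information weights of the paper)
    lam, step     -- regularization strength and step size (assumed names)
    """
    out = list(beta)
    for g, idx in enumerate(groups):
        # Euclidean norm of the coefficients in this group.
        norm = math.sqrt(sum(beta[j] ** 2 for j in idx))
        thresh = step * lam * group_weights[g]
        # A group whose norm falls below the threshold is zeroed as a whole;
        # otherwise every coefficient in it is shrunk by the same factor.
        scale = 0.0 if norm <= thresh else 1.0 - thresh / norm
        for j in idx:
            out[j] = scale * beta[j]
    return out


# Toy example: two groups of two "genes" each, with equal weights.
beta = [3.0, 4.0, 0.3, -0.4]
groups = [[0, 1], [2, 3]]
weights = [1.0, 1.0]
shrunk = group_soft_threshold(beta, groups, weights, lam=1.0)
print(shrunk)
```

In this toy case the first group (norm 5) is kept and shrunk, while the second group (norm 0.5) is removed entirely, which is the group-level sparsity that a group lasso penalty induces. Raising a group's weight makes that group more likely to be dropped.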