IEEE/ACM Trans Comput Biol Bioinform. 2018 Nov-Dec;15(6):2028-2038. doi: 10.1109/TCBB.2017.2761871. Epub 2017 Oct 11.
This paper addresses cancer classification and grouped gene selection. A weighted gene co-expression network built from cancer microarray data is used to identify modules corresponding to biological pathways, and a strategy for dividing genes into groups is derived from these modules. Using the conditional mutual information within each group, an integrated criterion is proposed and data-driven weights are constructed. These weights are shown to evaluate both the significance of an individual gene and its influence on improving the correlation among all other gene pairs in its group. Furthermore, an adaptive sparse group lasso is proposed, for which an improved blockwise descent algorithm is developed. Results on four cancer data sets demonstrate that the proposed adaptive sparse group lasso can effectively perform classification and grouped gene selection.
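To make the idea of an adaptive sparse group lasso penalty more concrete, the following is a minimal sketch, not the paper's algorithm. The abstract does not state the objective, so the sketch assumes the commonly used form: a weighted L1 term on individual genes plus a weighted L2-norm term on each gene group. It shows only the shrinkage (proximal) step for that assumed penalty, not the improved blockwise descent algorithm itself. The names `soft_threshold`, `adaptive_sgl_prox`, `gene_weights`, `group_weights`, `lam1`, and `lam2` are all hypothetical, and the weights here are plain numbers rather than values derived from conditional mutual information.

```python
import math


def soft_threshold(value, threshold):
    """Shrink a scalar toward zero by `threshold` (the lasso shrinkage step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def adaptive_sgl_prox(z, groups, gene_weights, group_weights, lam1, lam2):
    """Proximal map of the ASSUMED penalty
        lam1 * sum_j gene_weights[j] * |b_j|
      + lam2 * sum_g group_weights[g] * ||b_g||_2
    evaluated at the point z.

    `groups` is a list of index lists that partition range(len(z)).
    """
    b = [0.0] * len(z)
    for g, idx in enumerate(groups):
        # Step 1: gene-level shrinkage with a per-gene (adaptive) threshold.
        u = [soft_threshold(z[j], lam1 * gene_weights[j]) for j in idx]
        norm_u = math.sqrt(sum(x * x for x in u))
        # Step 2: group-level shrinkage; a weak group is dropped entirely.
        t = lam2 * group_weights[g]
        if norm_u <= t:
            continue
        scale = 1.0 - t / norm_u
        for j, x in zip(idx, u):
            b[j] = scale * x
    return b


# Toy input: four coefficients in two groups, all weights set to 1.
z = [3.0, -0.5, 0.2, 0.1]
groups = [[0, 1], [2, 3]]
beta = adaptive_sgl_prox(z, groups, [1.0] * 4, [1.0, 1.0], lam1=1.0, lam2=1.0)
print(beta)  # [1.0, 0.0, 0.0, 0.0]
```

In this toy run the second group is removed as a whole, while within the first group only one gene survives. That two-level sparsity is the behavior a sparse group lasso is designed to produce. Larger per-gene or per-group weights would penalize the corresponding genes or groups more heavily.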