Huang Hai-Hui, Liu Xiao-Ying, Liang Yong
Faculty of Information Technology & State Key Laboratory of Quality Research in Chinese Medicines, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau, 999078, China.
PLoS One. 2016 May 2;11(5):e0149675. doi: 10.1371/journal.pone.0149675. eCollection 2016.
Cancer classification and feature (gene) selection play an important role in knowledge discovery in genomic data. Although logistic regression is one of the most popular classification methods, it does not by itself perform feature selection. In this paper, we present a new hybrid L1/2+2 regularization (HLR) function, a linear combination of the L1/2 and L2 penalties, to select the relevant genes in logistic regression. The HLR approach inherits desirable characteristics from both penalties: sparsity from L1/2, and the grouping effect from L2, whereby highly correlated variables enter or leave the model together. We also propose a novel univariate HLR thresholding approach to update the estimated coefficients and develop a coordinate descent algorithm for the HLR-penalized logistic regression model. Empirical results and simulations indicate that the proposed method is highly competitive with several state-of-the-art methods.
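To make the "linear combination of L1/2 and L2 penalties" concrete, the following is a minimal sketch of how such a penalized logistic objective could be evaluated. It is not the authors' implementation: the abstract does not give the exact parameterization, so the mixing weight `alpha`, the overall strength `lam`, the averaging of the log-likelihood over samples, the omission of an intercept, and the function names `hlr_penalty` and `penalized_neg_log_likelihood` are all assumptions made for illustration.

```python
import numpy as np

def hlr_penalty(beta, lam, alpha):
    """Assumed hybrid penalty: lam * (alpha * sum|b|^(1/2) + (1 - alpha) * sum b^2)."""
    beta = np.asarray(beta, dtype=float)
    l_half = np.sum(np.sqrt(np.abs(beta)))  # L1/2 term, encourages sparsity
    l2 = np.sum(beta ** 2)                  # L2 term, encourages grouping
    return lam * (alpha * l_half + (1.0 - alpha) * l2)

def penalized_neg_log_likelihood(beta, X, y, lam, alpha):
    """Mean logistic negative log-likelihood plus the assumed hybrid penalty.

    X is an (n_samples, n_genes) array, y holds 0/1 labels, and no
    intercept is modeled (an assumption of this sketch).
    """
    beta = np.asarray(beta, dtype=float)
    z = X @ beta
    # log(1 + exp(z)) computed stably with logaddexp
    nll = np.mean(np.logaddexp(0.0, z) - y * z)
    return nll + hlr_penalty(beta, lam, alpha)
```

Under this parameterization, `alpha = 1` reduces to a pure L1/2 penalty and `alpha = 0` to a pure L2 (ridge) penalty. The sketch only evaluates the objective; the univariate thresholding update and the coordinate descent loop mentioned in the abstract are not reproduced here.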