Zhang Xiang, Wu Yichao, Wang Lan, Li Runze
North Carolina State University, Raleigh, NC, USA.
The University of Minnesota, Minneapolis, MN, USA.
J R Stat Soc Series B Stat Methodol. 2016 Jan;78(1):53-76. doi: 10.1111/rssb.12100. Epub 2015 Jan 5.
The support vector machine (SVM) is a powerful binary classification tool with high accuracy and great flexibility. It has achieved great success, but its performance can be seriously impaired if many redundant covariates are included. Some efforts have been devoted to studying variable selection for SVMs, but asymptotic properties, such as variable selection consistency, are largely unknown when the number of predictors diverges to infinity. In this work, we establish a unified theory for a general class of nonconvex penalized SVMs. We first prove that, in ultra-high dimensions, the objective function of a nonconvex penalized SVM admits a local minimizer possessing the desired oracle property. We then address the problem of nonunique local minimizers by showing that, provided an appropriate initial estimator is available, the local linear approximation (LLA) algorithm is guaranteed to converge to the oracle estimator even in the ultra-high dimensional setting. This condition on the initial estimator is shown to hold automatically whenever the dimension is only moderately high. Numerical examples provide supportive evidence.
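As a hedged illustration of the approach the abstract describes (not the authors' implementation), the sketch below fits a SCAD-penalized linear SVM by the local linear approximation: start from an L1-penalized SVM fit, then repeatedly solve a weighted-L1 hinge-loss problem with weights given by the SCAD penalty derivative. The toy data, function names, and the simple proximal-subgradient solver are all illustrative choices of ours.

```python
import numpy as np

def scad_deriv(t, lam, a=3.7):
    """Derivative p'_lam(|t|) of the SCAD penalty (a = 3.7 is the conventional choice)."""
    t = np.abs(t)
    return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1.0))

def weighted_l1_svm(X, y, w, n_iter=3000, lr=0.1):
    """Proximal-subgradient solver for mean hinge loss + sum_j w_j |b_j|."""
    n, p = X.shape
    b = np.zeros(p)
    for k in range(n_iter):
        step = lr / np.sqrt(k + 1.0)
        mask = y * (X @ b) < 1.0                         # margin violators
        g = -(X[mask] * y[mask, None]).sum(axis=0) / n   # subgradient of mean hinge loss
        b = b - step * g
        # prox of the weighted L1 penalty: coordinate-wise soft-thresholding
        b = np.sign(b) * np.maximum(np.abs(b) - step * w, 0.0)
    return b

def lla_scad_svm(X, y, lam, a=3.7, n_lla=3):
    """LLA: lasso-type SVM initial estimator, then iterated weighted-L1 steps."""
    p = X.shape[1]
    b = weighted_l1_svm(X, y, np.full(p, lam))  # L1-penalized initial estimator
    for _ in range(n_lla):
        b = weighted_l1_svm(X, y, scad_deriv(b, lam, a))
    return b

# Toy check: 3 relevant covariates among 30 candidates.
rng = np.random.default_rng(0)
n, p = 200, 30
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -2.0, 1.5]
X = rng.standard_normal((n, p))
y = np.sign(X @ beta_true + 0.3 * rng.standard_normal(n))
b_hat = lla_scad_svm(X, y, lam=0.05)
```

Because the SCAD derivative vanishes for large coefficients, relevant covariates are left essentially unpenalized after the first LLA step, while redundant ones keep the full penalty weight, which is the mechanism behind the oracle property discussed in the abstract.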