Ni Xiao, Zhang Hao Helen, Zhang Daowen
Department of Statistics, North Carolina State University.
J Multivar Anal. 2009 Oct 1;100(9):2100-2111. doi: 10.1016/j.jmva.2009.06.009.
We propose and study a unified procedure for variable selection in partially linear models. A new type of double-penalized least squares is formulated, using the smoothing spline to estimate the nonparametric part and applying a shrinkage penalty on parametric components to achieve model parsimony. Theoretically we show that, with proper choices of the smoothing and regularization parameters, the proposed procedure can be as efficient as the oracle estimator (Fan and Li, 2001). We also study the asymptotic properties of the estimator when the number of parametric effects diverges with the sample size. Frequentist and Bayesian estimates of the covariance and confidence intervals are derived for the estimators. One great advantage of this procedure is its linear mixed model (LMM) representation, which greatly facilitates its implementation by using standard statistical software. Furthermore, the LMM framework enables one to treat the smoothing parameter as a variance component and hence conveniently estimate it together with other regression coefficients. Extensive numerical studies are conducted to demonstrate the effective performance of the proposed procedure.
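As an illustration only, the double-penalized least squares idea can be sketched numerically. The sketch below makes several assumptions that are not taken from the abstract: the nonparametric part is evaluated on an equally spaced, sorted grid; a discrete second-difference penalty stands in for the smoothing spline roughness penalty; and the L1 (lasso) penalty is swapped in as a simple shrinkage penalty, which may differ from the penalty the authors use. The function names and the scaling of `lam_smooth` and `lam_shrink` are hypothetical.

```python
import numpy as np

def soft_threshold(z, thr):
    # Soft-thresholding operator used by the L1 coordinate update.
    return np.sign(z) * np.maximum(np.abs(z) - thr, 0.0)

def second_difference_penalty(n):
    # D is the (n-2) x n second-difference operator; D.T @ D is a discrete
    # roughness penalty (an assumed stand-in for the smoothing spline penalty).
    D = np.diff(np.eye(n), n=2, axis=0)
    return D.T @ D

def fit_double_penalized(X, y, lam_smooth, lam_shrink, n_iter=200, tol=1e-8):
    """Minimize over (beta, f):
        0.5 * ||y - X beta - f||^2
        + 0.5 * lam_smooth * f' K f        (roughness penalty on f)
        + n * lam_shrink * sum_j |beta_j|  (shrinkage penalty on beta)
    Assumes rows are sorted by the nonparametric covariate on an equally
    spaced grid. Any intercept is absorbed into f.
    """
    n, p = X.shape
    K = second_difference_penalty(n)
    # For fixed beta the minimizing f is S @ (y - X beta).
    S = np.linalg.inv(np.eye(n) + lam_smooth * K)
    # Profiling f out leaves 0.5 * r' (I - S) r with r = y - X beta.
    A = np.eye(n) - S
    G = X.T @ A @ X
    c = X.T @ A @ y
    beta = np.zeros(p)
    for _ in range(n_iter):
        beta_old = beta.copy()
        for j in range(p):
            # Partial residual correlation, excluding coordinate j.
            r_j = c[j] - G[j] @ beta + G[j, j] * beta[j]
            beta[j] = soft_threshold(r_j, n * lam_shrink) / G[j, j]
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    f = S @ (y - X @ beta)
    return beta, f

# Simulated example (all values here are made up for illustration).
rng = np.random.default_rng(0)
n = 200
t = np.linspace(0.0, 1.0, n)
X = rng.normal(size=(n, 6))
beta_true = np.array([3.0, 1.5, 0.0, 0.0, 2.0, 0.0])
f_true = np.sin(2.0 * np.pi * t)
y = X @ beta_true + f_true + 0.3 * rng.normal(size=n)

beta_hat, f_hat = fit_double_penalized(X, y, lam_smooth=1000.0, lam_shrink=0.15)
print("estimated coefficients:", np.round(beta_hat, 3))
```

In this sketch the two tuning parameters are fixed by hand. The abstract's point about the LMM representation is that the smoothing parameter can instead be treated as a variance component and estimated from the data, which this simplified example does not attempt.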