Li Yang, Liu Jun S
Yang Li is Sr. Market Scientist, Vatic Labs LLC, New York, NY 10036. Jun S Liu is Professor, Department of Statistics, Harvard University, Cambridge, MA 02138; and is also co-Director for the Center for Statistical Science, Department of Industrial Engineering, Tsinghua University, Beijing, China.
J Am Stat Assoc. 2019;114(525):271-286. doi: 10.1080/01621459.2017.1401541. Epub 2018 Jun 28.
Under the logistic regression framework, we propose a forward-backward method, SODA, for variable selection with both main and quadratic interaction terms. In the forward stage, SODA adds predictors that have significant overall effects, whereas in the backward stage SODA removes unimportant terms to optimize the extended Bayesian Information Criterion (EBIC). Compared with existing methods for variable selection in quadratic discriminant analysis, SODA can deal with high-dimensional data in which the number of predictors is much larger than the sample size, and it does not require the joint normality assumption on predictors, which makes it considerably more robust. We further extend SODA to conduct variable selection and model fitting for general index models. Compared with existing variable selection methods based on Sliced Inverse Regression (SIR) (Li, 1991), SODA requires neither the linearity condition nor the constant-variance condition and is thus more robust. Our theoretical analysis establishes the variable-selection consistency of SODA under high-dimensional settings, and our simulation studies as well as real-data applications demonstrate the superior performance of SODA in dealing with non-Gaussian design matrices in both logistic and general index models.
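The backward stage described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: it covers only backward elimination over main and quadratic terms, and it assumes the generic EBIC form, -2 log-likelihood + k log n + 2γk log(number of candidate terms); the exact penalty used by SODA may differ. The function names (`fit_logistic_loglik`, `design`, `ebic`, `backward_eliminate`), the default `gamma=0.5`, and the small ridge term added for numerical stability are all hypothetical choices made for illustration.

```python
import itertools
import numpy as np


def fit_logistic_loglik(Z, y, n_iter=50, ridge=1e-6):
    """Approximate maximized log-likelihood of a logistic model with intercept.

    Uses Newton's method; the tiny ridge term is an illustrative safeguard
    against singular Hessians, not part of the method in the abstract.
    """
    A = np.column_stack([np.ones(len(y)), Z])
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        eta = np.clip(A @ beta, -30, 30)
        mu = 1.0 / (1.0 + np.exp(-eta))
        w = mu * (1.0 - mu)
        hess = A.T @ (A * w[:, None]) + ridge * np.eye(A.shape[1])
        grad = A.T @ (y - mu) - ridge * beta
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-8:
            break
    eta = np.clip(A @ beta, -30, 30)
    return float(np.sum(y * eta - np.logaddexp(0.0, eta)))


def design(X, terms):
    """Build columns for terms: (j,) is a main effect, (j, k) is x_j * x_k."""
    cols = [np.prod(X[:, list(t)], axis=1) for t in terms]
    return np.column_stack(cols) if cols else np.empty((X.shape[0], 0))


def ebic(X, y, terms, gamma=0.5):
    """Generic EBIC (assumed form); candidates = p main + p(p+1)/2 quadratic."""
    n, p = X.shape
    n_candidates = p + p * (p + 1) // 2
    k = len(terms)
    loglik = fit_logistic_loglik(design(X, terms), y)
    return -2.0 * loglik + k * np.log(n) + 2.0 * gamma * k * np.log(n_candidates)


def backward_eliminate(X, y, selected_vars, gamma=0.5):
    """Start from all main and quadratic terms among selected_vars, then
    repeatedly drop the single term whose removal lowers EBIC the most."""
    terms = [(j,) for j in selected_vars]
    terms += list(itertools.combinations_with_replacement(selected_vars, 2))
    best = ebic(X, y, terms, gamma)
    while terms:
        scores = [ebic(X, y, terms[:i] + terms[i + 1:], gamma)
                  for i in range(len(terms))]
        i = int(np.argmin(scores))
        if scores[i] >= best:  # no single removal improves EBIC: stop
            break
        best = scores[i]
        terms.pop(i)
    return terms, best
```

Calling `backward_eliminate(X, y, [0, 1, 2])` on an `n × p` predictor matrix `X` and a 0/1 response `y` returns the retained terms and their EBIC value. A forward stage that screens predictors for overall effects, as the abstract describes, would supply `selected_vars`; it is not sketched here.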