Bradic Jelena, Fan Jianqing, Wang Weiwei
Department of Operations Research and Financial Engineering, Princeton University, Princeton, USA.
J R Stat Soc Series B Stat Methodol. 2011 Jun;73(3):325-349. doi: 10.1111/j.1467-9868.2010.00764.x.
In high-dimensional model selection problems, penalized least-squares approaches have been extensively used. This paper addresses the question of both robustness and efficiency of penalized model selection methods and proposes a data-driven weighted linear combination of convex loss functions, together with a weighted L1-penalty. The procedure is completely data-adaptive and does not require prior knowledge of the error distribution. The weighted L1-penalty is used both to ensure the convexity of the penalty term and to ameliorate the bias caused by the L1-penalty. In the setting where the dimensionality is much larger than the sample size, we establish a strong oracle property of the proposed method, which possesses both model selection consistency and estimation efficiency for the true non-zero coefficients. As specific examples, we introduce a robust composite L1-L2 method and an optimal composite quantile method, and we evaluate their performance on both simulated and real data examples.
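To make the construction concrete, the following is a minimal sketch of a composite L1-L2 objective with an adaptively weighted L1 penalty, on a small toy problem rather than the paper's high-dimensional setting. The mixing weight `w`, the tuning parameter `lam`, the data-generating model, and the use of a generic optimizer are all illustrative assumptions, not the paper's actual estimator or tuning rule.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy data: n observations, p coefficients, heavy-tailed errors
# (hypothetical setup; the paper treats p much larger than n).
n, p = 200, 3
beta_true = np.array([2.0, 0.0, -1.5])
X = rng.normal(size=(n, p))
y = X @ beta_true + rng.standard_t(df=3, size=n)

def composite_objective(beta, w=0.5, lam=0.1, penalty_weights=None):
    """Weighted combination of L1 and L2 losses plus a weighted L1 penalty.

    w controls the mix between the absolute (robust) and squared losses;
    penalty_weights implements the weighted L1-penalty.
    """
    if penalty_weights is None:
        penalty_weights = np.ones_like(beta)
    r = y - X @ beta
    loss = w * np.mean(np.abs(r)) + (1 - w) * np.mean(r ** 2)
    return loss + lam * np.sum(penalty_weights * np.abs(beta))

# Adaptive penalty weights from a pilot least-squares fit: large weights
# on coefficients near zero, small weights on large coefficients, which
# reduces the bias of a plain L1-penalty.
beta_init = np.linalg.lstsq(X, y, rcond=None)[0]
pw = 1.0 / (np.abs(beta_init) + 1e-6)

# Powell handles the non-smooth objective on this small problem; the
# paper's theory does not depend on this particular solver.
res = minimize(lambda b: composite_objective(b, penalty_weights=pw),
               x0=np.zeros(p), method="Powell")
beta_hat = res.x
```

On this toy data the penalized fit recovers the two non-zero coefficients while shrinking the zero coefficient strongly toward zero, illustrating the model-selection effect of the weighted penalty.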