Tan Kean Ming, Sun Qiang, Witten Daniela
Department of Statistics, University of Michigan, Ann Arbor, MI.
Department of Statistical Sciences, University of Toronto, Toronto, ON, Canada.
J Am Stat Assoc. 2023;118(544):2383-2393. doi: 10.1080/01621459.2022.2050243. Epub 2022 Apr 15.
We propose a sparse reduced rank Huber regression for analyzing large and complex high-dimensional data with heavy-tailed random noise. The proposed method is based on a convex relaxation of a rank- and sparsity-constrained nonconvex optimization problem, which is then solved using a block coordinate descent and an alternating direction method of multipliers algorithm. We establish nonasymptotic estimation error bounds under both Frobenius and nuclear norms in the high-dimensional setting. This is a major contribution over existing results in reduced rank regression, which mainly focus on rank selection and prediction consistency. Our theoretical results quantify the tradeoff between heavy-tailedness of the random noise and statistical bias. For random noise with bounded (1+δ)th moment with δ ∈ (0,1), the rate of convergence is a function of δ, and is slower than the sub-Gaussian-type deviation bounds; for random noise with bounded second moment, we obtain a rate of convergence as if sub-Gaussian noise were assumed. We illustrate the performance of the proposed method via extensive numerical studies and a data application. Supplementary materials for this article are available online.
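The paper's estimator combines the Huber loss with sparsity and rank penalties, solved via block coordinate descent and ADMM; reproducing that is beyond a short example. As a minimal sketch of the Huber loss component alone, the following illustrates why it is robust to heavy-tailed noise: it is quadratic for small residuals and linear for large ones, so outliers contribute a bounded gradient. The function names, the gradient-descent solver, and the default robustification parameter tau=1.345 are illustrative choices, not the authors' implementation.

```python
import numpy as np

def huber_loss(r, tau=1.345):
    """Huber loss: 0.5*r^2 for |r| <= tau, tau*(|r| - 0.5*tau) beyond."""
    a = np.abs(r)
    return np.where(a <= tau, 0.5 * r**2, tau * (a - 0.5 * tau))

def huber_grad(r, tau=1.345):
    """Derivative of the Huber loss in r: the residual clipped to [-tau, tau]."""
    return np.clip(r, -tau, tau)

def huber_regression(X, y, tau=1.345, lr=0.1, n_iter=2000):
    """Plain gradient descent on the (unpenalized) Huber objective.
    Illustrative only: the paper adds sparsity and nuclear-norm penalties
    and uses BCD/ADMM instead of gradient descent."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        r = y - X @ beta
        beta += lr * (X.T @ huber_grad(r, tau)) / n
    return beta

# Heavy-tailed noise (t-distribution with 2 degrees of freedom has
# bounded (1+δ)th moment for δ < 1 but infinite variance):
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.standard_t(df=2, size=500)
beta_hat = huber_regression(X, y)
```

Because the gradient contribution of any single observation is bounded by tau, a few extreme noise draws cannot dominate the fit the way they would under squared-error loss.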