Adaptive Huber Regression.

Authors

Sun Qiang, Zhou Wen-Xin, Fan Jianqing

Affiliations

Department of Statistical Sciences, University of Toronto, Toronto, ON M5S 3G3, Canada.

Department of Mathematics, University of California, San Diego, La Jolla, CA 92093.

Publication information

J Am Stat Assoc. 2020;115(529):254-265. doi: 10.1080/01621459.2018.1543124. Epub 2019 Apr 22.

Abstract

Big data can easily be contaminated by outliers or contain variables with heavy-tailed distributions, which makes many conventional methods inadequate. To address this challenge, we propose the adaptive Huber regression for robust estimation and inference. The key observation is that the robustification parameter should adapt to the sample size, dimension and moments for optimal tradeoff between bias and robustness. Our theoretical framework deals with heavy-tailed distributions with bounded (1 + δ)-th moment for any δ > 0. We establish a sharp phase transition for robust estimation of regression parameters in both low and high dimensions: when δ ≥ 1, the estimator admits a sub-Gaussian-type deviation bound without sub-Gaussian assumptions on the data, while only a slower rate is available in the regime 0 < δ < 1 and the transition is smooth and optimal. In addition, we extend the methodology to allow both heavy-tailed predictors and observation noise. Simulation studies lend further support to the theory. In a genetic study of cancer cell lines that exhibit heavy-tailedness, the proposed methods are shown to be more robust and predictive.
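To make the idea of an adaptive robustification parameter concrete, the following is a minimal sketch and not the authors' implementation. It fits a Huber regression by iteratively reweighted least squares (a solver chosen here for brevity) and lets the threshold tau grow with the sample size n and shrink with the dimension d. The function names `huber_weights` and `adaptive_huber_fit`, the constant `c`, the MAD-based scale estimate, and the specific rule tau = c * sigma * sqrt(n / (d + log n)) are all assumptions made for illustration; the paper's calibrated choice of tau, which also depends on the moment index δ, is not reproduced here.

```python
import numpy as np


def huber_weights(residuals, tau):
    """IRLS weights for the Huber loss: 1 if |r| <= tau, tau / |r| otherwise."""
    abs_r = np.abs(residuals)
    return np.where(abs_r <= tau, 1.0, tau / np.maximum(abs_r, 1e-12))


def adaptive_huber_fit(X, y, c=1.0, n_iter=50, tol=1e-8):
    """Huber regression with a threshold tau that adapts to n and d.

    Returns (beta, tau), where beta[0] is the intercept.
    The rule for tau below is an illustrative heuristic (an assumption),
    not the calibrated choice analysed in the paper.
    """
    n, d = X.shape
    Xc = np.column_stack([np.ones(n), X])  # prepend an intercept column
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)  # ordinary least squares start
    tau = np.inf
    for _ in range(n_iter):
        r = y - Xc @ beta
        # Robust residual scale via the median absolute deviation (MAD).
        sigma = 1.4826 * np.median(np.abs(r - np.median(r)))
        # Hypothetical adaptive rule: larger n allows a larger threshold.
        tau = c * max(sigma, 1e-12) * np.sqrt(n / (d + np.log(n)))
        sw = np.sqrt(huber_weights(r, tau))
        # Weighted least squares step with the current Huber weights.
        beta_new, *_ = np.linalg.lstsq(Xc * sw[:, None], y * sw, rcond=None)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, tau


# Usage: heavy-tailed noise (Student t with 1.5 degrees of freedom,
# which has infinite variance).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 5))
beta_true = np.array([1.0, 2.0, -1.0, 0.5, 0.0, 3.0])  # intercept first
y_demo = beta_true[0] + X_demo @ beta_true[1:] + rng.standard_t(df=1.5, size=500)
beta_hat, tau_hat = adaptive_huber_fit(X_demo, y_demo)
print("estimated coefficients:", np.round(beta_hat, 3))
print("adaptive threshold tau:", round(float(tau_hat), 3))
```

With a very large tau the fit reduces to ordinary least squares, and with a very small tau it approaches least absolute deviations; tying tau to n and d is one simple way to move between the two as more data become available.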
