By Tracy Ke, Jiashun Jin, and Jianqing Fan
Princeton University and Carnegie Mellon University.
Ann Stat. 2014 Nov 1;42(6):2202-2242. doi: 10.1214/14-AOS1243.
Consider a linear model y = Xβ + ε, where X = (x_1, …, x_p) and ε ~ N(0, σ²I). The vector β is unknown and it is of interest to separate its nonzero coordinates from the zero ones (i.e., variable selection). Motivated by examples in long-memory time series (Fan and Yao, 2003) and the change-point problem (Bhattacharya, 1994), we are primarily interested in the case where the Gram matrix G = X'X is non-sparse but sparsifiable by a finite-order linear filter. We focus on the regime where signals are both rare and weak, so that successful variable selection is very challenging but is still possible. We approach this problem by a new procedure called covariance assisted screening and estimation (CASE). CASE first uses a linear filtering to reduce the original setting to a new regression model where the corresponding Gram (covariance) matrix is sparse. The new covariance matrix induces a sparse graph, which guides us to conduct multivariate screening without visiting all the submodels. By interacting with the signal sparsity, the graph enables us to decompose the original problem into many separated small-size subproblems (if only we know where they are!). Linear filtering also induces a so-called problem of information leakage, which can be overcome by the newly introduced patching technique. Together, these give rise to CASE, which is a two-stage Screen and Clean (Fan and Song, 2010; Wasserman and Roeder, 2009) procedure, where we first identify candidates of these submodels by screening, and then re-examine each candidate to remove false positives. For any procedure β̂ for variable selection, we measure the performance by the minimax Hamming distance between the sign vectors of β̂ and β. We show that in a broad class of situations where the Gram matrix is non-sparse but sparsifiable, CASE achieves the optimal rate of convergence. The results are successfully applied to long-memory time series and the change-point model.