Partition-based ultrahigh-dimensional variable screening.

Authors

Kang Jian, Hong Hyokyoung G, Li Yi

Affiliations

Department of Biostatistics, University of Michigan, 1415 Washington Heights, Ann Arbor, Michigan 48109, U.S.A.

Department of Statistics and Probability, Michigan State University, 619 Red Cedar Rd, East Lansing, Michigan 48823, U.S.A.

Publication information

Biometrika. 2017 Nov;104(4):785-800. doi: 10.1093/biomet/asx052. Epub 2017 Oct 9.

Abstract

Traditional variable selection methods are compromised by overlooking useful information on covariates with similar functionality or spatial proximity, and by treating each covariate independently. Leveraging prior grouping information on covariates, we propose partition-based screening methods for ultrahigh-dimensional variables in the framework of generalized linear models. We show that partition-based screening exhibits the sure screening property with a vanishing false selection rate. For settings where prior knowledge on covariate grouping is unavailable or unreliable, we propose a data-driven partition screening framework and investigate its theoretical properties. We consider two special cases: correlation-guided partitioning and spatial location-guided partitioning. In the absence of a single partition, we propose a theoretically justified strategy for combining statistics from various partitioning methods. The utility of the proposed methods is demonstrated via simulation and analysis of functional neuroimaging data.
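
To make the idea of screening within a partition concrete, the following is a minimal sketch and not the authors' implementation. It assumes the simplest generalized linear model (Gaussian response, identity link), a partition supplied in advance as lists of column indices, and a top-k selection rule. The function name `partition_screen`, the simulated data, and the choice of `top_k` are hypothetical and used only for illustration. Covariates in the same group are fitted jointly, and each covariate is ranked by the absolute value of its fitted coefficient; with singleton groups this reduces to ordinary marginal screening.

```python
import numpy as np

def partition_screen(X, y, groups, top_k):
    """Rank covariates by |coefficient| from a joint least-squares fit per group.

    Illustrative assumption: Gaussian linear model, so each within-group
    maximum likelihood fit is an ordinary least-squares fit.
    """
    n, p = X.shape
    stats = np.zeros(p)
    for idx in groups:
        idx = np.asarray(idx)
        # Design matrix: intercept plus only the covariates in this group.
        Z = np.column_stack([np.ones(n), X[:, idx]])
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        # Screening statistic: absolute slope estimate (intercept dropped).
        stats[idx] = np.abs(coef[1:])
    # Keep the top_k covariates with the largest statistics.
    selected = np.sort(np.argsort(stats)[::-1][:top_k])
    return stats, selected

# Simulated example: 3 active covariates among 1000, all in the first group.
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.standard_normal((n, p))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize columns
beta = np.zeros(p)
beta[:3] = 2.0
y = X @ beta + rng.standard_normal(n)

# Hypothetical partition: consecutive blocks of 5 covariates.
groups = [list(range(start, start + 5)) for start in range(0, p, 5)]
stats, selected = partition_screen(X, y, groups, top_k=20)
print(selected)
```

Each group must contain fewer covariates than there are observations so that the within-group fit is well defined. For a non-Gaussian response, the least-squares step would be replaced by the corresponding within-group GLM fit.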
