Censored Rank Independence Screening for High-dimensional Survival Data.

Authors

Song Rui, Lu Wenbin, Ma Shuangge, Jeng X Jessie

Affiliations

Department of Statistics, North Carolina State University, Raleigh, North Carolina 27695, USA.

Division of Biostatistics, School of Public Health, Yale University, New Haven, Connecticut 06510, USA.

Publication information

Biometrika. 2014;101(4):799-814. doi: 10.1093/biomet/asu047.

Abstract

In modern statistical applications, the dimension of covariates can be much larger than the sample size. In the context of linear models, correlation screening (Fan and Lv, 2008) has been shown to reduce the dimension of such data effectively while achieving the sure screening property, i.e., all of the active variables can be retained with high probability. However, screening based on the Pearson correlation does not perform well when applied to contaminated covariates and/or censored outcomes. In this paper, we study censored rank independence screening of high-dimensional survival data. The proposed method is robust to predictors that contain outliers, works for a general class of survival models, and enjoys the sure screening property. Simulations and an analysis of real data demonstrate that the proposed method performs competitively on survival data sets of moderate size and high-dimensional predictors, even when these are contaminated.
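For orientation, the baseline that the abstract contrasts against, marginal correlation screening in a linear model, can be sketched as follows. This is a minimal illustration only, not the censored rank procedure proposed in the paper: the function name `correlation_screen`, the cutoff `d`, and the simulated data are all assumptions introduced here, and the sketch ignores censoring entirely and assumes no covariate column is constant.

```python
import numpy as np


def correlation_screen(X, y, d):
    """Rank covariates by |marginal Pearson correlation| with y; keep the top d.

    Hypothetical helper for illustration. Assumes an uncensored response and
    that no column of X is constant (otherwise the denominator is zero).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)  # center each covariate
    yc = y - y.mean()        # center the response
    # Pearson correlation of every column of X with y, computed in one pass.
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    corr = (Xc.T @ yc) / denom
    # Indices sorted by decreasing absolute correlation.
    order = np.argsort(-np.abs(corr))
    return order[:d], corr


def demo(seed=0):
    """Simulate p >> n with two active covariates and screen down to 20."""
    rng = np.random.default_rng(seed)
    n, p = 100, 1000
    X = rng.standard_normal((n, p))
    y = 3.0 * X[:, 0] - 3.0 * X[:, 1] + rng.standard_normal(n)
    kept, corr = correlation_screen(X, y, d=20)
    return kept, corr
```

Calling `demo()` returns the 20 retained column indices together with all 1000 marginal correlations. Because the ranking uses the raw Pearson correlation, a few extreme covariate values can reorder it, and a censored response cannot be plugged in directly. These are the two limitations the abstract attributes to this kind of screening.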

Similar articles

1
Censored Rank Independence Screening for High-dimensional Survival Data.
Biometrika. 2014;101(4):799-814. doi: 10.1093/biomet/asu047.
2
Feature Screening via Distance Correlation Learning.
J Am Stat Assoc. 2012 Jul 1;107(499):1129-1139. doi: 10.1080/01621459.2012.695654.
3
Nonparametric Independence Screening in Sparse Ultra-High Dimensional Additive Models.
J Am Stat Assoc. 2011 Jun;106(494):544-557. doi: 10.1198/jasa.2011.tm09779.
4
Censored cumulative residual independent screening for ultrahigh-dimensional survival data.
Lifetime Data Anal. 2018 Apr;24(2):273-292. doi: 10.1007/s10985-017-9395-2. Epub 2017 May 26.
5
Nonparametric screening and feature selection for ultrahigh-dimensional Case II interval-censored failure time data.
Biom J. 2020 Dec;62(8):1909-1925. doi: 10.1002/bimj.201900154. Epub 2020 Jul 16.
6
Model-Free Conditional Independence Feature Screening For Ultrahigh Dimensional Data.
Sci China Math. 2017 Mar;60(3):551-568. doi: 10.1007/s11425-016-0186-8. Epub 2016 Dec 29.
7
Feature Screening for High-Dimensional Variable Selection in Generalized Linear Models.
Entropy (Basel). 2023 May 26;25(6):851. doi: 10.3390/e25060851.
8
Model-Free Feature Screening for Ultrahigh Dimensional Discriminant Analysis.
J Am Stat Assoc. 2015 Jun 1;110(510):630-641. doi: 10.1080/01621459.2014.920256.
9
Feature Screening in Ultrahigh Dimensional Cox's Model.
Stat Sin. 2016;26:881-901. doi: 10.5705/ss.2014.171.
10
Regularized Quantile Regression and Robust Feature Screening for Single Index Models.
Stat Sin. 2016 Jan;26(1):69-95. doi: 10.5705/ss.2014.049.

Cited by

3
EFFICIENT ESTIMATION OF THE MAXIMAL ASSOCIATION BETWEEN MULTIPLE PREDICTORS AND A SURVIVAL OUTCOME.
Ann Stat. 2023 Oct;51(5):1965-1988. doi: 10.1214/23-aos2313. Epub 2023 Dec 14.
4
AN OMNIBUS TEST FOR DETECTION OF SUBGROUP TREATMENT EFFECTS VIA DATA PARTITIONING.
Ann Appl Stat. 2022 Dec;16(4):2266-2278. doi: 10.1214/21-AOAS1589. Epub 2022 Sep 26.
5
Sure Joint Screening for High Dimensional Cox's Proportional Hazards Model Under the Case-Cohort Design.
J Comput Biol. 2023 Jun;30(6):663-677. doi: 10.1089/cmb.2022.0416. Epub 2023 May 3.
6
Gene Screening in High-Throughput Right-Censored Lung Cancer Data.
Onco (Basel). 2022 Dec;2(4):305-318. doi: 10.3390/onco2040017. Epub 2022 Oct 17.
7
Unified model-free interaction screening via CV-entropy filter.
Comput Stat Data Anal. 2023 Apr;180. doi: 10.1016/j.csda.2022.107684. Epub 2022 Dec 28.
8
Feature screening for case-cohort studies with failure time outcome.
Scand Stat Theory Appl. 2021 Mar;48(1):349-370. doi: 10.1111/sjos.12503. Epub 2020 Nov 16.
9
Sparse group variable selection for gene-environment interactions in the longitudinal study.
Genet Epidemiol. 2022 Jul;46(5-6):317-340. doi: 10.1002/gepi.22461. Epub 2022 Jun 29.
10
Ultrahigh-dimensional sufficient dimension reduction for censored data with measurement error in covariates.
J Appl Stat. 2020 Dec 8;49(5):1154-1178. doi: 10.1080/02664763.2020.1856352. eCollection 2022.

References

1
Principled sure independence screening for Cox models with ultra-high-dimensional covariates.
J Multivar Anal. 2012 Feb 1;105(1):397-411. doi: 10.1016/j.jmva.2011.08.002.
2
Novel rank-based approaches for discovery and replication in genome-wide association studies.
Genetics. 2011 Sep;189(1):329-40. doi: 10.1534/genetics.111.130542. Epub 2011 Jul 29.
3
Penalized Estimating Functions and Variable Selection in Semiparametric Regression Models.
J Am Stat Assoc. 2008 Jun 1;103(482):672-680. doi: 10.1198/016214508000000184.
4
Discussion of "Sure Independence Screening for Ultra-High Dimensional Feature Space".
J R Stat Soc Series B Stat Methodol. 2008 Nov;70(5):903. doi: 10.1111/j.1467-9868.2008.00674.x.
5
Univariate shrinkage in the Cox model for high dimensional data.
Stat Appl Genet Mol Biol. 2009;8(1):Article21. doi: 10.2202/1544-6115.1438. Epub 2009 Apr 14.
7
Cross-validated Cox regression on microarray gene expression data.
Stat Med. 2006 Sep 30;25(18):3201-16. doi: 10.1002/sim.2353.
8
Gene expression profiling predicts clinical outcome of breast cancer.
Nature. 2002 Jan 31;415(6871):530-6. doi: 10.1038/415530a.
10
The lasso method for variable selection in the Cox model.
Stat Med. 1997 Feb 28;16(4):385-95. doi: 10.1002/(sici)1097-0258(19970228)16:4<385::aid-sim380>3.0.co;2-3.
