Liu Yi, Chen Xiaolin, Li Gang
School of Mathematical Sciences, Ocean University of China, Qingdao, China.
School of Statistics, Qufu Normal University, Qufu, China.
Stat Methods Med Res. 2020 Jun;29(6):1499-1513. doi: 10.1177/0962280219864710. Epub 2019 Jul 30.
In an ultra-high dimensional setting with a huge number of covariates, variable screening is useful for dimension reduction before applying a more refined method for model selection and statistical analysis. This paper proposes a new sure joint screening procedure for right-censored time-to-event data based on a sparsity-restricted semiparametric accelerated failure time model. Our method, referred to as Buckley-James assisted sure screening (BJASS), consists of an initial screening step using a sparsity-restricted least-squares estimate based on a synthetic time variable and a refinement screening step using a sparsity-restricted least-squares estimate with the Buckley-James imputed event times. The refinement step may be repeated several times to obtain more stable results. We show that with any fixed number of refinement steps, the BJASS procedure retains all important variables with probability tending to 1. Simulation results are presented to illustrate its performance in comparison with some marginal screening methods. Real data examples are provided using a diffuse large B-cell lymphoma (DLBCL) dataset and a breast cancer dataset. We have implemented the BJASS method in MATLAB and made it available to readers on GitHub at https://github.com/yiucla/BJASS.
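The two-step structure described above (an initial sparsity-restricted fit on a synthetic response, followed by repeated refits on Buckley-James imputed times) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB implementation. Several choices here are assumptions made only for illustration: the synthetic response is taken to be an inverse-censoring-weighted one computed from Kaplan-Meier jump masses, the sparsity-restricted least-squares problem is solved by plain iterative hard thresholding, `y` is assumed to be the log observed time, and all function names (`hard_threshold_ls`, `km_jump_masses`, `buckley_james_impute`, `bjass_sketch`) are hypothetical.

```python
import numpy as np

def hard_threshold_ls(X, y, k, n_iter=200):
    """Iterative hard thresholding for min ||y - Xb||^2 with at most k nonzeros."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1 / (largest singular value)^2
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        g = b + step * X.T @ (y - X @ b)     # gradient step
        keep = np.argsort(np.abs(g))[-k:]    # indices of the k largest entries
        b = np.zeros_like(g)
        b[keep] = g[keep]
    return b

def km_jump_masses(values, events):
    """Kaplan-Meier jump mass at each point (largest point treated as an event)."""
    n = len(values)
    order = np.lexsort((1 - events, values))  # sort by value, events first at ties
    d = events[order].astype(float)
    d[-1] = 1.0                               # common convention so masses sum to 1
    at_risk = n - np.arange(n)
    surv_before = np.concatenate(([1.0], np.cumprod(1.0 - d / at_risk)[:-1]))
    masses = np.empty(n)
    masses[order] = surv_before * d / at_risk
    return masses

def buckley_james_impute(X, y, delta, b):
    """Replace censored responses by X b + E[residual | residual > observed one]."""
    resid = y - X @ b
    masses = km_jump_masses(resid, delta)
    y_imp = y.astype(float).copy()
    for i in np.where(delta == 0)[0]:
        later = resid > resid[i]
        tail = masses[later].sum()
        if tail > 0:
            y_imp[i] = X[i] @ b + (masses[later] * resid[later]).sum() / tail
    return y_imp

def bjass_sketch(X, y, delta, k, n_refine=2):
    """Return indices of at most k covariates retained after n_refine refinements."""
    n = len(y)
    Xc = X - X.mean(axis=0)
    # Assumed synthetic response: n * (KM jump mass) * y, which equals
    # delta * y / G(y-) with G the KM estimate of the censoring survival function.
    y_syn = n * km_jump_masses(y, delta) * y
    b = hard_threshold_ls(Xc, y_syn - y_syn.mean(), k)       # initial screening
    for _ in range(n_refine):                                # refinement steps
        y_imp = buckley_james_impute(Xc, y, delta, b)
        b = hard_threshold_ls(Xc, y_imp - y_imp.mean(), k)
    return np.flatnonzero(b)
```

Under these assumptions, `bjass_sketch(X, y, delta, k)` takes an n-by-p covariate matrix, log observed times, event indicators (1 for an event, 0 for censored), and the number of covariates to retain. Both the choice of `k` and the number of refinement steps are left to the user here; the sketch makes no claim about how the paper selects them.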