Chai Hao, Zhang Qingzhao, Huang Jian, Ma Shuangge
Yale University.
Xiamen University.
Stat Sin. 2019 Apr;29(2):877-894. doi: 10.5705/ss.202016.0449.
Data with high-dimensional covariates are now commonly encountered. Compared with research on other types of responses, research on high-dimensional data with censored survival responses remains relatively limited, and most existing studies have focused on estimation and variable selection. In this study, we consider data with a censored survival response, a set of low-dimensional covariates of main interest, and a set of high-dimensional covariates that may also affect survival. The accelerated failure time model is adopted to describe survival. The goal is to conduct inference for the effects of the low-dimensional covariates while properly accounting for the high-dimensional covariates. A penalization-based procedure is developed, and its validity is established under mild and widely adopted conditions. Simulations suggest satisfactory performance of the proposed procedure, and the analysis of two cancer genetic datasets demonstrates its practical applicability.