Zhang Jing, Liu Yanyan
School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan, Hubei, People's Republic of China.
School of Mathematics and Statistics, Wuhan University, Wuhan, Hubei, People's Republic of China.
J Appl Stat. 2020 Jun 2;48(10):1755-1774. doi: 10.1080/02664763.2020.1772734. eCollection 2021.
For ultrahigh-dimensional data, independent feature screening has been demonstrated both theoretically and empirically to be an effective dimension reduction method with low computational demand. Motivated by the Buckley-James method for accommodating censoring, we propose a fused Kolmogorov-Smirnov filter to screen out irrelevant covariates in ultrahigh-dimensional survival data. The proposed model-free screening method works with many types of covariates (e.g. continuous, discrete and categorical variables) and is shown to enjoy the sure independent screening property under mild regularity conditions, without requiring any moment conditions on the covariates. In particular, the proposed procedure remains powerful when covariates are strongly dependent on each other. We further develop an iterative algorithm to enhance the performance of our method in practical situations where some covariates are marginally unrelated but jointly related to the response. We conduct extensive simulations to evaluate the finite-sample performance of the proposed method, showing that it performs favourably compared with typical existing methods. As an illustration, we apply the proposed method to the diffuse large-B-cell lymphoma study.
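To make the screening idea concrete, the following is a minimal sketch of one common formulation of a fused Kolmogorov-Smirnov filter for a fully observed (uncensored) response. It is not the procedure of this paper: the Buckley-James-motivated handling of censoring and the iterative algorithm are not reproduced here. The function names (`ks_distance`, `fused_ks_scores`, `screen`) and the default slicing schemes are hypothetical choices made for illustration. The response is cut into slices by rank, each covariate is scored by the largest sup-norm distance between its empirical distributions in any two slices, and the scores are summed over several slicing schemes.

```python
import numpy as np


def ks_distance(a, b):
    """Sup-norm distance between the empirical CDFs of samples a and b."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])  # the supremum is attained at a sample point
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


def fused_ks_scores(X, y, slice_counts=(3, 4, 5)):
    """Score each column of X; larger means stronger marginal dependence on y.

    Assumption: y is fully observed (no censoring). slice_counts is an
    illustrative default, not a value taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    ranks = np.argsort(np.argsort(y))  # ranks 0..n-1 of the response
    scores = np.zeros(p)
    for G in slice_counts:
        labels = (ranks * G) // n  # slice label in 0..G-1, near-equal sizes
        for j in range(p):
            groups = [X[labels == g, j] for g in range(G)]
            best = 0.0
            for g in range(G):
                for h in range(g + 1, G):
                    best = max(best, ks_distance(groups[g], groups[h]))
            scores[j] += best  # fuse by summing over slicing schemes
    return scores


def screen(X, y, d, slice_counts=(3, 4, 5)):
    """Return the indices of the d highest-scoring covariates."""
    scores = fused_ks_scores(X, y, slice_counts)
    return np.argsort(scores)[::-1][:d]
```

Because the score depends on each covariate only through its empirical distribution within slices, the sketch needs no moment assumptions on the covariates and applies equally to continuous and discrete columns. Extending it to survival data would require replacing the rank-based slicing of `y` with a construction that accounts for censoring.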