DiRienzo Albert Gregory
Department of Epidemiology and Biostatistics, University at Albany-SUNY, Rensselaer, New York, 12144, U.S.A.
Biometrics. 2016 Jun;72(2):452-62. doi: 10.1111/biom.12420. Epub 2015 Sep 27.
A new objective methodology is proposed to select the parsimonious set of important covariates that are associated with a censored outcome variable Y; the method simplifies to accommodate uncensored outcomes. Covariate selection proceeds in an iterated forward manner and is controlled by a pre-chosen upper bound on the number of covariates to be selected and by the global false selection rate and level. A sequence of working regression models for the event (Y≤y) given a covariate set is fit among subjects not censored before y, and the corresponding process (through y) of conditional prediction error is estimated; the direction and magnitude of covariate effects can change arbitrarily with y. The newly proposed adequacy measure for the covariate set is the slope coefficient resulting from a regression (with no intercept) between the baseline prediction error process for the intercept-only model and the process corresponding to the covariate set. Under quite general conditions on the censoring variable, the methods are shown to asymptotically control the false selection rate at the nominal level while consistently ranking covariate sets, which permits recruitment of all important covariates from those available with probability tending to 1. A simulation study confirms these analytical results and compares the proposed methods to recent competitors. Two real data illustrations are provided.
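The adequacy measure described above can be illustrated with a rough sketch. This is not the paper's estimator. It assumes a least-squares (linear probability) working model for the event (Y≤y), an in-sample Brier-type squared error as the prediction error, no censoring weights, and a slope obtained by regressing the covariate-set error process on the baseline process (the abstract does not fix the direction). The function names `prediction_error_process` and `adequacy_slope`, the evaluation grid, and the simulated data are all hypothetical.

```python
import numpy as np


def prediction_error_process(X, time, event, grid):
    """Squared-error process of a least-squares working model for I(Y <= y).

    Assumption: a linear probability model stands in for the working
    regression model, and the error is computed in-sample.
    """
    errors = []
    for y in grid:
        # Keep subjects not censored before y: the event was observed,
        # or the subject was still under follow-up at y.
        keep = (event == 1) | (time >= y)
        target = ((time <= y) & (event == 1))[keep].astype(float)
        design = np.column_stack([np.ones(keep.sum()), X[keep]])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        errors.append(np.mean((target - design @ coef) ** 2))
    return np.array(errors)


def adequacy_slope(err_baseline, err_covariates):
    """Slope of a no-intercept regression of covariate-set error on baseline error."""
    return float(err_baseline @ err_covariates / (err_baseline @ err_baseline))


# Simulated data (hypothetical): x1 drives the outcome, x2 is pure noise.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
outcome = np.exp(1.5 * x1 + 0.5 * rng.normal(size=n))
censor = rng.exponential(scale=5.0, size=n)
time = np.minimum(outcome, censor)
event = (outcome <= censor).astype(int)

# Evaluate the processes on interior quantiles of the observed times.
grid = np.quantile(time, np.linspace(0.2, 0.8, 13))

no_covariates = np.empty((n, 0))  # intercept-only baseline model
err_baseline = prediction_error_process(no_covariates, time, event, grid)
err_important = prediction_error_process(x1[:, None], time, event, grid)
err_noise = prediction_error_process(x2[:, None], time, event, grid)

slope_important = adequacy_slope(err_baseline, err_important)
slope_noise = adequacy_slope(err_baseline, err_noise)
print(f"slope for x1: {slope_important:.3f}, slope for x2: {slope_noise:.3f}")
```

Under these assumptions a slope well below 1 means the covariate set reduces prediction error relative to the intercept-only model across the grid, while a slope near 1 means it adds little. A forward selection step would then compare such slopes across candidate sets; the false selection rate control described in the abstract is not reproduced here.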