Roy Pourab, Fine Jason P, Kosorok Michael R
US Food and Drug Administration (This work was done prior to the author joining the FDA and does not represent the official position of the FDA).
Department of Biostatistics, University of North Carolina at Chapel Hill.
Scand Stat Theory Appl. 2022 Jun;49(2):525-541. doi: 10.1111/sjos.12526. Epub 2021 Mar 16.
In prevalent cohort studies where subjects are recruited at a cross-section, the time to an event may be subject to length-biased sampling, with the observed data being the forward recurrence time, the backward recurrence time, or their sum. In the regression setting, assuming a semiparametric accelerated failure time model for the underlying event time, where the intercept parameter is absorbed into the nuisance parameter, it has been shown that the model remains invariant under these observed data set-ups and can be fitted using standard methodology for accelerated failure time model estimation, ignoring the length-bias. However, the efficiency of these estimators is unclear, because the observed covariate distribution, which is also length-biased, may contain information about the regression parameter in the accelerated failure time model. We demonstrate that if the true covariate distribution is completely unspecified, then the naive estimator based on the conditional likelihood given the covariates is fully efficient for the slope.
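The invariance described above can be illustrated with a small simulation. The sketch below is not the authors' estimator or code: it assumes a log-linear accelerated failure time model log T = beta * X + eps with normal errors, draws a length-biased sample by selecting subjects with probability proportional to T, and splits each sampled T at a uniform point into backward and forward recurrence times. Ordinary least squares on the log scale stands in for a "standard" fitting method (reasonable here only because the simulated data are uncensored), and the function names, sample sizes, and distributions are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_length_biased_aft(n_pop=200_000, n_obs=20_000, beta=0.5, seed=0):
    """Simulate a length-biased sample from an assumed AFT model.

    Assumed model (illustrative only): log T = beta * X + eps,
    with X ~ N(0, 1) and eps ~ N(0, 0.5^2) in the underlying population.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_pop)
    eps = rng.normal(scale=0.5, size=n_pop)
    t = np.exp(beta * x + eps)

    # Length-biased sampling: selection probability proportional to T.
    idx = rng.choice(n_pop, size=n_obs, replace=True, p=t / t.sum())
    x_obs, t_obs = x[idx], t[idx]

    # The cross-section falls uniformly within each sampled duration.
    u = rng.uniform(size=n_obs)
    backward = u * t_obs          # time from onset to recruitment
    forward = t_obs - backward    # time from recruitment to the event
    return x_obs, backward, forward, t_obs


def ols_slope(x, y):
    """Least-squares slope of y on x (intercept absorbed by centering)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


if __name__ == "__main__":
    x_obs, backward, forward, total = simulate_length_biased_aft()
    for name, y in [("backward", backward), ("forward", forward), ("sum", total)]:
        print(name, "slope estimate:", ols_slope(x_obs, np.log(y)))
    # The sampled covariates are themselves length-biased (shifted mean).
    print("mean of sampled X:", x_obs.mean())
```

Under these assumptions, the slope estimates from all three observed-data set-ups should sit close to the true beta, while the intercept (not estimated here) and the mean of the sampled covariate both shift. That shift in the covariate distribution is the potential extra source of information about the slope that the efficiency result addresses.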