Department of Mathematics and Statistics, McGill University, Montreal, Canada.
Department of Mathematics, Statistics and Computer Science, Macalester College, St. Paul, USA.
BMC Med Res Methodol. 2022 Jan 7;22(1):10. doi: 10.1186/s12874-021-01496-3.
When modelling the survival distribution of a disease whose symptomatic progression is insidious, it is not always clear how to measure the failure/censoring times from the true date of disease onset. In a prevalent cohort study with follow-up, one way to remove the influence of uncertainty in the onset dates is to use only the residual lifetimes. Because the residual lifetimes are measured from a well-defined screening date (prevalence day) to failure/censoring, these observed durations are essentially error-free. Using residual lifetime data, the nonparametric maximum likelihood estimator (NPMLE) may be used to estimate the underlying survival function. However, the resulting estimator can yield exceptionally wide confidence intervals. Parametric maximum likelihood estimation can yield narrower confidence intervals, but it may not be robust to model misspecification. Using only right-censored residual lifetime data, we propose a stacking procedure to overcome this lack of robustness to model misspecification; our proposed estimator is a linear combination of individual nonparametric/parametric survival function estimators, with optimal stacking weights obtained by minimizing a Brier score loss function.
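The stacking idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact form of the Brier score for residual lifetime data, the number of candidate estimators, or the constraints on the weights. The sketch assumes only two candidates (say, one nonparametric and one parametric), weights that are non-negative and sum to one, and caller-supplied inverse-probability-of-censoring weights (`ipcw`) to handle right censoring. The function names `weighted_brier` and `stack_two_estimators` are hypothetical.

```python
from typing import Sequence


def weighted_brier(pred: Sequence[float], alive: Sequence[float],
                   ipcw: Sequence[float]) -> float:
    """Weighted Brier score: mean of ipcw * (alive - pred)^2.

    pred  : predicted survival probabilities S(t) for each (subject, time) pair
    alive : 1.0 if the subject is known to survive past t, else 0.0
    ipcw  : censoring weights (assumed to be computed elsewhere)
    """
    total = sum(k * (y - p) ** 2 for p, y, k in zip(pred, alive, ipcw))
    return total / len(pred)


def stack_two_estimators(s_a: Sequence[float], s_b: Sequence[float],
                         alive: Sequence[float],
                         ipcw: Sequence[float]) -> float:
    """Return the weight w in [0, 1] on candidate A that minimizes the
    weighted Brier score of the stacked prediction w*s_a + (1 - w)*s_b.

    The loss is a convex quadratic in w, so the constrained minimizer is
    the unconstrained least-squares solution clipped to [0, 1].
    """
    num = 0.0
    den = 0.0
    for a, b, y, k in zip(s_a, s_b, alive, ipcw):
        d = a - b                # difference between the two candidates
        num += k * d * (y - b)   # weighted cross term
        den += k * d * d         # weighted squared difference
    if den == 0.0:
        return 0.5               # candidates coincide; any weight is optimal
    return min(1.0, max(0.0, num / den))


def stacked_prediction(w: float, s_a: Sequence[float],
                       s_b: Sequence[float]) -> list:
    """Convex combination of the two candidate survival estimates."""
    return [w * a + (1.0 - w) * b for a, b in zip(s_a, s_b)]
```

With more than two candidates the same loss would be minimized over the probability simplex with a constrained quadratic-programming or general-purpose optimizer; the closed form above is specific to the two-candidate case.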