Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland.
Biometrics. 2022 Mar;78(1):179-191. doi: 10.1111/biom.13413. Epub 2020 Dec 18.
We study the efficiency of covariate-specific estimates of pure risk (one minus the survival function) when some covariates are only available for case-control samples nested in a cohort. We focus on the semiparametric additive hazards model in which the hazard function equals a baseline hazard plus a linear combination of covariates with either time-varying or time-invariant coefficients. A published approach uses the design-based inclusion probabilities to reweight the nested case-control data. We obtain more efficient estimates of pure risks by calibrating the design weights to data available in the entire cohort, for both time-varying and time-invariant covariate coefficients. We develop explicit variance formulas for the weight-calibrated estimates based on influence functions. Simulations show the improvement in precision by using weight calibration and confirm the consistency of variance estimators and the validity of inference based on asymptotic normality. Examples are provided using data from the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial Study (PLCO).
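The abstract does not say which calibration scheme is used, so the following is only a minimal sketch of generic linear (GREG-type) weight calibration, assuming auxiliary variables are observed for every cohort member. The function name `calibrate_weights`, the simulated cohort, and the constant inclusion probability of 0.2 are all hypothetical choices for illustration, not taken from the paper. The design weights (inverse inclusion probabilities) are adjusted so that the weighted totals of the auxiliary variables in the subsample reproduce the known cohort totals.

```python
import numpy as np

def calibrate_weights(design_w, aux, cohort_totals):
    """Linear calibration of design weights (illustrative, hypothetical helper).

    Returns weights w_i = d_i * (1 + a_i' lam), with lam chosen so that
    sum_i w_i * a_i equals the cohort totals of the auxiliary variables.
    """
    d = np.asarray(design_w, dtype=float)        # design weights, shape (n,)
    A = np.asarray(aux, dtype=float)             # auxiliary variables, shape (n, p)
    T = np.asarray(cohort_totals, dtype=float)   # cohort totals, shape (p,)
    gap = T - A.T @ d                            # shortfall of the design-weighted totals
    M = A.T @ (d[:, None] * A)                   # weighted Gram matrix A' D A
    lam = np.linalg.solve(M, gap)                # solves the calibration equations
    return d * (1.0 + A @ lam)

# Simulated cohort (assumption): an intercept plus one auxiliary covariate.
rng = np.random.default_rng(0)
n_cohort = 1000
aux_cohort = np.column_stack([np.ones(n_cohort), rng.normal(size=n_cohort)])

# Hypothetical subsampling with a constant inclusion probability of 0.2.
incl_prob = np.full(n_cohort, 0.2)
sampled = rng.random(n_cohort) < incl_prob
design_w = 1.0 / incl_prob[sampled]

cohort_totals = aux_cohort.sum(axis=0)
cal_w = calibrate_weights(design_w, aux_cohort[sampled], cohort_totals)
```

After calibration, `aux_cohort[sampled].T @ cal_w` matches `cohort_totals` up to floating-point error. Linear calibration can produce negative weights in unfavorable samples; raking (exponential calibration) is a common alternative that keeps the weights positive.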