Weinberg C R, Moledor E S, Umbach D M, Sandler D P
Statistics and Biomathematics Branch, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, USA.
Epidemiology. 1996 Sep;7(5):490-7.
In reconstructing exposure histories needed to calculate cumulative exposures, gaps often occur. Our investigation was motivated by case-control studies of residential radon exposure and lung cancer, where half or more of the targeted homes may not be measurable. Investigators have adopted various schemes for imputing exposures for such gaps. We first undertook simulations to assess the performance of five such methods under an excess relative risk model, in the presence of random missingness and under assumed independence among the true exposure levels for different epochs of exposure (houses). Assuming no other source of measurement error, one of the methods performed without bias and with coverage of nominally 95% confidence intervals that was close to 95%. This method assigns to the missing residences the arithmetic mean across all measured control residences. We show that its good properties can be explained by the fact that this approach produces approximate "Berkson errors." To take advantage of predictive information that might exist about the missing epochs of exposure, one might prefer to carry out the imputations within strata. In further simulations, we asked whether the method would still perform well if imputations were carried out within many strata. It does, and much of the lost statistical power/precision can be recovered if the stratification system is moderately predictive of the missing exposures. Thus, observed control mean imputation provides a way to impute missing exposures without corrupting the study's validity; and stratifying the imputations can enhance precision. The technique is applicable in other settings where exposure histories contain gaps.
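The imputation rule described above (assign to each unmeasured residence the arithmetic mean of the measured control residences, optionally within strata) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the record layout (`is_case`, `stratum`, `exposure`), the function name `impute_control_mean`, and the toy values are all assumptions made for the example, and a missing measurement is represented as `None`.

```python
from collections import defaultdict
from statistics import fmean


def impute_control_mean(residences, stratify=False):
    """Fill missing exposures with the mean of measured CONTROL residences.

    Hypothetical record layout: each residence is a dict with keys
    'is_case' (bool), 'stratum' (any hashable), 'exposure' (float or None).
    """
    # Pool measured control residences, overall or per stratum.
    pools = defaultdict(list)
    for r in residences:
        if not r["is_case"] and r["exposure"] is not None:
            key = r["stratum"] if stratify else None
            pools[key].append(r["exposure"])
    means = {key: fmean(values) for key, values in pools.items()}

    filled = []
    for r in residences:
        r = dict(r)  # copy so the input records are left unchanged
        r["imputed"] = r["exposure"] is None
        if r["imputed"]:
            key = r["stratum"] if stratify else None
            # Raises KeyError if a stratum has no measured control residence.
            r["exposure"] = means[key]
        filled.append(r)
    return filled


# Toy data (made up for illustration only).
residences = [
    {"is_case": False, "stratum": "A", "exposure": 2.0},
    {"is_case": False, "stratum": "A", "exposure": 4.0},
    {"is_case": False, "stratum": "B", "exposure": 6.0},
    {"is_case": True, "stratum": "A", "exposure": 10.0},
    {"is_case": True, "stratum": "B", "exposure": None},
    {"is_case": False, "stratum": "A", "exposure": None},
]

overall = impute_control_mean(residences)
stratified = impute_control_mean(residences, stratify=True)
```

Note that the same control-based mean is assigned to missing residences of cases and controls alike; measured case residences never enter the pool. In the stratified variant, a stratum with no measured control residence has no mean to assign, so the sketch fails loudly rather than falling back silently.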