Consulting in the Public Interest (CIPI), Lambertville, NJ, USA.
Environ Health. 2013 Aug 7;12:62. doi: 10.1186/1476-069X-12-62.
Environmental epidemiology, when focused on the life course of exposure to a specific pollutant, requires historical exposure estimates that are difficult to obtain for the full time period due to gaps in the historical record, especially in earlier years. We show that these gaps can be filled by applying multiple imputation methods to a formal risk equation that incorporates lifetime exposure. We also address challenges that arise, including choice of imputation method, potential bias in regression coefficients, and uncertainty in age-at-exposure sensitivities.
During time periods when parameters needed in the risk equation are missing for an individual, the parameters are filled in by an imputation model using group-level information or interpolation. A random component is added to match the variance found in the estimates for study subjects not needing imputation. The process is repeated to obtain multiple data sets, whose regressions against health data can be combined statistically, using Rubin's rules, to develop confidence limits that account for the uncertainty introduced by the imputations. Recall bias between cases and controls can occur when historical residence location is obtained by interview, and it can lead to misclassification of imputed exposure by disease status. To test for such bias, we introduce an "incompleteness index," equal to the percentage of dose imputed (PDI) for a subject. "Effective doses" can be computed using different functional dependencies of relative risk on age at exposure, allowing intercomparison of different risk models. To illustrate our approach, we quantify lifetime exposure (dose) from traffic air pollution in an established case-control study on Long Island, New York, where considerable in-migration occurred over a period of many decades.
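The pooling and incompleteness-index steps can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `pool_rubin` and `percent_dose_imputed`, the per-period dose list, and the boolean imputation flags are all hypothetical, and the pooling uses the standard textbook form of Rubin's rules for a single regression coefficient.

```python
from math import inf
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets (Rubin's rules).

    estimates: coefficient estimate from each imputed data set
    variances: squared standard error of that estimate in each data set
    Returns (pooled estimate, total variance, degrees of freedom).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)            # pooled point estimate
    w = mean(variances)                # within-imputation variance
    b = variance(estimates)            # between-imputation variance (n - 1 denominator)
    t = w + (1 + 1 / m) * b            # total variance
    if b == 0:
        df = inf                       # imputations agree exactly
    else:
        df = (m - 1) * (1 + w / ((1 + 1 / m) * b)) ** 2
    return q_bar, t, df


def percent_dose_imputed(doses, imputed):
    """PDI for one subject: share of cumulative dose from imputed periods.

    doses:   dose contribution of each time period (hypothetical layout)
    imputed: True where that period's dose came from the imputation model
    """
    total = sum(doses)
    if total == 0:
        return 0.0
    imputed_dose = sum(d for d, flag in zip(doses, imputed) if flag)
    return 100.0 * imputed_dose / total
```

Confidence limits would then follow from the pooled estimate, the square root of the total variance, and a t quantile at the returned degrees of freedom. The PDI value can be used to stratify or order subjects when checking regressions for consistency.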
The major result is the imputation approach itself. The illustrative example revealed potential recall bias, suggesting that regressions against health data should be done as a function of PDI to check for consistency of results. The 1% of study subjects who lived for long durations near heavily trafficked intersections had very high cumulative exposures. Thus, imputation methods must be designed to reproduce non-standard distributions.
Our approach meets a number of methodological challenges to extending historical exposure reconstruction over a lifetime and shows promise for environmental epidemiology. Application to assessment of breast cancer risks will be reported in a subsequent manuscript.