Am J Epidemiol. 2013 May 1;177(9):904-12. doi: 10.1093/aje/kws340. Epub 2013 Apr 4.
Outcome misclassification is widespread in epidemiology, but methods to account for it are rarely used. We describe the use of multiple imputation to reduce bias when validation data are available for a subgroup of study participants. This approach is illustrated using data from 308 participants in the multicenter Herpetic Eye Disease Study between 1992 and 1998 (48% female; 85% white; median age, 49 years). The odds ratio comparing the acyclovir group with the placebo group on the gold-standard outcome (physician-diagnosed herpes simplex virus recurrence) was 0.62 (95% confidence interval (CI): 0.35, 1.09). We masked ourselves to physician diagnosis except for a 30% validation subgroup used to compare methods. Multiple imputation (odds ratio (OR) = 0.60; 95% CI: 0.24, 1.51) was compared with naive analysis using self-reported outcomes (OR = 0.90; 95% CI: 0.47, 1.73), analysis restricted to the validation subgroup (OR = 0.57; 95% CI: 0.20, 1.59), and direct maximum likelihood (OR = 0.62; 95% CI: 0.26, 1.53). In simulations, multiple imputation and direct maximum likelihood had greater statistical power than did analysis restricted to the validation subgroup, yet all 3 provided unbiased estimates of the odds ratio. The multiple-imputation approach was extended to estimate risk ratios using log-binomial regression. Multiple imputation has advantages regarding flexibility and ease of implementation for epidemiologists familiar with missing data methods.
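The multiple-imputation idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`simulate`, `mi_odds_ratio`, `log_or_and_var`), the simulated risks, and the sensitivity and specificity of self-report are all hypothetical. The imputation model here is a saturated one (a Beta posterior for the probability of the gold-standard outcome within each treatment-by-self-report stratum), swapped in for a regression-based imputation model so that the example needs only the standard library. The interval uses a normal approximation rather than the t reference distribution usually paired with Rubin's rules.

```python
import math
import random


def simulate(n=308, val_frac=0.3, seed=1):
    """Simulate hypothetical (x, w, y) records.

    x = treatment (0/1), w = self-reported outcome (0/1),
    y = gold-standard outcome (0/1), or None outside the validation subgroup.
    All probabilities below are illustrative assumptions.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = int(rng.random() < 0.5)
        y = int(rng.random() < (0.15 if x else 0.25))   # assumed true risks
        w = int(rng.random() < (0.7 if y else 0.1))     # assumed sens. 0.7, spec. 0.9
        validated = rng.random() < val_frac
        data.append((x, w, y if validated else None))
    return data


def log_or_and_var(records):
    """Log odds ratio and Woolf variance from (x, y) pairs.

    A 0.5 continuity correction keeps every cell positive.
    """
    n = [[0.5, 0.5], [0.5, 0.5]]  # n[x][y]
    for x, y in records:
        n[x][y] += 1
    log_or = math.log(n[1][1] * n[0][0] / (n[1][0] * n[0][1]))
    var = sum(1.0 / n[x][y] for x in (0, 1) for y in (0, 1))
    return log_or, var


def mi_odds_ratio(data, m=40, seed=0):
    """Impute unvalidated outcomes m times and pool with Rubin's rules."""
    rng = random.Random(seed)
    # Gold-standard outcome counts in the validation subgroup, by (x, w) stratum.
    counts = {(x, w): [0, 0] for x in (0, 1) for w in (0, 1)}
    for x, w, y in data:
        if y is not None:
            counts[(x, w)][y] += 1

    estimates, variances = [], []
    for _ in range(m):
        # Draw P(Y=1 | x, w) from a Beta posterior (Jeffreys prior) so that
        # uncertainty in the imputation model carries into each imputation.
        p = {k: rng.betavariate(c[1] + 0.5, c[0] + 0.5) for k, c in counts.items()}
        completed = [
            (x, y if y is not None else int(rng.random() < p[(x, w)]))
            for x, w, y in data
        ]
        est, var = log_or_and_var(completed)
        estimates.append(est)
        variances.append(var)

    # Rubin's rules: total variance = within + (1 + 1/m) * between.
    q_bar = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    se = math.sqrt(within + (1 + 1 / m) * between)
    return math.exp(q_bar), math.exp(q_bar - 1.96 * se), math.exp(q_bar + 1.96 * se)


if __name__ == "__main__":
    or_hat, lo, hi = mi_odds_ratio(simulate())
    print(f"MI odds ratio: {or_hat:.2f} (95% CI: {lo:.2f}, {hi:.2f})")
```

Because the treatment indicator is included in the imputation strata, the imputed outcomes preserve the exposure-outcome association rather than pulling it toward the null. With covariates beyond a single binary exposure, a regression-based imputation model would be the more natural choice.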