Observational Health Data Sciences and Informatics, New York, NY 10032;
Epidemiology Analytics, Janssen Research & Development, Titusville, NJ 08560.
Proc Natl Acad Sci U S A. 2018 Mar 13;115(11):2571-2577. doi: 10.1073/pnas.1708282114.
Observational healthcare data, such as electronic health records and administrative claims, offer the potential to estimate effects of medical products at scale. Observational studies have often been found to be nonreproducible, however, generating conflicting results even when using the same database to answer the same question. One source of discrepancies is error, both random (caused by sampling variability) and systematic (for example, because of confounding, selection bias, and measurement error). Only random error is typically quantified, but it converges to zero as databases become larger, whereas systematic error persists independent of sample size and therefore increases in relative importance. Negative controls are exposure-outcome pairs where one believes no causal effect exists; they can be used to detect multiple sources of systematic error, but interpreting their results is not always straightforward. Previously, we have shown that an empirical null distribution can be derived from a sample of negative controls and used to calibrate P values, accounting for both random and systematic error. Here, we extend this work to calibration of confidence intervals (CIs). CI calibration requires positive controls, which we synthesize by modifying negative controls. We show that our CI calibration restores nominal characteristics, such as 95% coverage of the true effect size by the 95% CI. We furthermore show that CI calibration reduces disagreement in replications of two pairs of conflicting observational studies: one related to dabigatran, warfarin, and gastrointestinal bleeding and one related to selective serotonin reuptake inhibitors and upper gastrointestinal bleeding. We recommend CI calibration to improve reproducibility of observational studies.
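To make the calibration idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that systematic error on the log effect-size scale is Gaussian, and it estimates the empirical null from negative-control estimates with a simple method-of-moments step. The function names (`fit_empirical_null`, `calibrated_p`, `calibrated_ci`) and all numbers are hypothetical. For the CI, the sketch makes the simplifying assumption that systematic error does not depend on the true effect size, so it uses negative controls only; the abstract states that the actual CI calibration also requires synthesized positive controls.

```python
import math
from statistics import NormalDist, mean, variance


def fit_empirical_null(log_rrs, ses):
    """Estimate mean and SD of systematic error from negative controls.

    log_rrs: log effect estimates for negative controls (true log RR assumed 0).
    ses:     their standard errors.
    Method of moments (an assumption of this sketch): the observed variance is
    systematic variance plus average sampling variance.
    """
    mu = mean(log_rrs)
    tau2 = max(0.0, variance(log_rrs) - mean(se ** 2 for se in ses))
    return mu, math.sqrt(tau2)


def calibrated_p(log_rr, se, mu, tau):
    """Two-sided P value against the empirical null N(mu, tau^2 + se^2)."""
    z = (log_rr - mu) / math.sqrt(tau ** 2 + se ** 2)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def calibrated_ci(log_rr, se, mu, tau, level=0.95):
    """Simplified calibrated CI on the ratio scale.

    Assumes systematic error is the same at every true effect size, which is
    a simplification made here for illustration.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z_crit * math.sqrt(tau ** 2 + se ** 2)
    center = log_rr - mu  # remove the estimated bias
    return math.exp(center - half_width), math.exp(center + half_width)


# Hypothetical negative-control estimates (log scale) and standard errors.
nc_log_rrs = [0.1, 0.3, 0.2, 0.4, 0.0]
nc_ses = [0.1, 0.1, 0.1, 0.1, 0.1]
mu, tau = fit_empirical_null(nc_log_rrs, nc_ses)

# Hypothetical estimate for the exposure-outcome pair of interest.
p_cal = calibrated_p(0.5, 0.1, mu, tau)
ci_low, ci_high = calibrated_ci(0.5, 0.1, mu, tau)
print(f"calibrated P = {p_cal:.3f}, calibrated 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

In this sketch, an estimate that looks highly significant under the usual null can lose significance once the bias and extra spread observed in the negative controls are taken into account, and the calibrated interval is both shifted and widened relative to the nominal one.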