Janssen Research and Development LLC, 1125 Trenton-Harbourton Road, Room K30205, PO Box 200, Titusville, NJ, 08560, USA.
Drug Saf. 2013 Oct;36 Suppl 1:S143-58. doi: 10.1007/s40264-013-0108-9.
Observational healthcare data offer the potential to identify risks of medical products, and the medical literature is replete with analyses that aim to accomplish this objective. A number of established analytic methods dominate the literature, but their operating characteristics in real-world settings remain unknown.
This study aimed to compare the performance of seven methods (new user cohort, case control, self-controlled case series, self-controlled cohort, disproportionality analysis, temporal pattern discovery, and longitudinal Gamma Poisson Shrinker) as tools for risk identification in observational healthcare data.
The experiment applied each method to 399 drug-outcome scenarios (165 positive controls and 234 negative controls across 4 health outcomes of interest) in 5 real observational databases (4 administrative claims and 1 electronic health record).
Method performance was evaluated through the area under the receiver operating characteristic curve (AUC), bias, mean squared error, and confidence interval coverage probability.
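As an illustration of these four measures, the following is a minimal sketch rather than the study's actual implementation. It assumes each method returns a relative risk estimate with a 95% confidence interval for every drug-outcome pair, that the point estimate serves as the ranking score for the AUC, and that negative controls have a true relative risk of 1. The function names and the example numbers are hypothetical.

```python
import math

def auc(scores, labels):
    """Probability that a randomly chosen positive control scores higher
    than a randomly chosen negative control (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def error_summary(estimates, true_rr=1.0):
    """Bias, mean squared error, and coverage on the log relative risk scale.

    estimates: list of (rr, ci_lower, ci_upper) tuples.
    true_rr:   assumed true effect (1.0 for negative controls).
    """
    errors = [math.log(rr) - math.log(true_rr) for rr, _, _ in estimates]
    bias = sum(errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    covered = sum(1 for _, lo, hi in estimates if lo <= true_rr <= hi)
    return bias, mse, covered / len(estimates)

# Hypothetical estimates, for illustration only (not data from the study).
positive_rr = [2.0, 1.5, 0.9]
negative_est = [(1.0, 0.8, 1.25), (1.2, 1.05, 1.37),
                (0.8, 0.6, 1.07), (1.6, 1.2, 2.1)]

scores = positive_rr + [rr for rr, _, _ in negative_est]
labels = [1] * len(positive_rr) + [0] * len(negative_est)

example_auc = auc(scores, labels)
example_bias, example_mse, example_coverage = error_summary(negative_est)
print(example_auc, example_bias, example_mse, example_coverage)
```

In this sketch, coverage below 0.95 on the negative controls would indicate that the nominal 95% intervals are too narrow or systematically shifted.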
Multiple methods offer strong predictive accuracy: for every outcome and database, AUC > 0.70 was achievable with more than one analytical approach. Self-controlled methods (self-controlled case series, temporal pattern discovery, self-controlled cohort) had higher predictive accuracy than cohort and case-control methods across all databases and outcomes. Methods differed in the expected value and variance of the error distribution. All methods had confidence interval coverage probability below the nominal level.
Observational healthcare data can inform risk identification of medical product effects on acute liver injury, acute myocardial infarction, acute renal failure and gastrointestinal bleeding. However, effect estimates from all methods require calibration to address inconsistency in method operating characteristics. Further empirical evaluation is required to gauge the generalizability of these findings to other databases and outcomes.