Department of Health Management and Policy, School of Public Health, University of Michigan, Ann Arbor, MI, USA.
Center for Health Informatics, University of Manchester, Manchester, UK.
Stat Methods Med Res. 2019 Dec;28(12):3697-3711. doi: 10.1177/0962280218814570. Epub 2018 Nov 25.
Difference-in-differences (DID) analysis is used widely to estimate the causal effects of health policies and interventions. A critical assumption in DID is "parallel trends": that pre-intervention trends in outcomes are the same between treated and comparison groups. To date, little guidance has been available to researchers who wish to use DID when the parallel trends assumption is violated. Using a Monte Carlo simulation experiment, we tested the performance of several estimators (standard DID; DID with propensity score matching; single-group interrupted time-series analysis; and multi-group interrupted time-series analysis) when the parallel trends assumption is violated. Using nationwide data from US hospitals (n = 3737) for seven data periods (four pre-intervention and three post-intervention), we applied each estimator to evaluate the effect of a placebo intervention on common outcomes in health policy (clinical process quality and 30-day risk-standardized mortality for acute myocardial infarction, heart failure, and pneumonia). Estimator performance was assessed using mean-squared error and coverage. Mean-squared error values were considerably lower for the DID estimator with matching than for the standard DID or interrupted time-series analysis models. The DID estimator with matching also had superior coverage. Our findings were robust across all outcomes evaluated.
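To illustrate why a placebo intervention exposes violations of parallel trends, the following is a minimal sketch of a standard two-group DID estimate evaluated by Monte Carlo mean-squared error. It is not the authors' simulation: the linear trends, slopes, noise level, sample sizes, and the function names `simulate_group`, `did_estimate`, and `placebo_mse` are all assumptions chosen for illustration. Only the period counts (four pre-intervention, three post-intervention) and the zero true effect of a placebo come from the abstract.

```python
import random
from statistics import mean

N_PRE, N_POST = 4, 3  # period counts taken from the abstract


def simulate_group(n_units, slope, noise_sd, rng):
    """Return one outcome series per unit: linear trend plus Gaussian noise.

    The linear trend and the noise model are illustrative assumptions.
    """
    return [
        [slope * t + rng.gauss(0.0, noise_sd) for t in range(N_PRE + N_POST)]
        for _ in range(n_units)
    ]


def did_estimate(treated, comparison, n_pre=N_PRE):
    """Standard DID: pre-to-post change in the treated mean
    minus the pre-to-post change in the comparison mean."""
    def change(group):
        pre = mean(y for unit in group for y in unit[:n_pre])
        post = mean(y for unit in group for y in unit[n_pre:])
        return post - pre

    return change(treated) - change(comparison)


def placebo_mse(slope_treated, slope_comparison, n_reps=200, n_units=50,
                noise_sd=1.0, seed=0):
    """Monte Carlo MSE of the DID estimate when the true effect is zero."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_reps):
        treated = simulate_group(n_units, slope_treated, noise_sd, rng)
        comparison = simulate_group(n_units, slope_comparison, noise_sd, rng)
        estimates.append(did_estimate(treated, comparison))
    # The placebo has no effect, so the squared estimate is the squared error.
    return mean(e ** 2 for e in estimates)


mse_parallel = placebo_mse(0.5, 0.5)  # parallel trends hold
mse_violated = placebo_mse(1.0, 0.5)  # treated group trends upward faster
print(f"MSE, parallel trends: {mse_parallel:.3f}")
print(f"MSE, violated trends: {mse_violated:.3f}")
```

Under these assumptions, diverging pre-intervention slopes push the standard DID estimate away from zero even though the intervention does nothing, which inflates the mean-squared error. Matching on pre-intervention outcomes, as in the DID estimator with matching, aims to select comparison units whose trends resemble those of the treated group.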