From the RAND Corporation, Arlington, VA.
Johns Hopkins Bloomberg School of Public Health, Baltimore, MD.
Epidemiology. 2023 Nov 1;34(6):856-864. doi: 10.1097/EDE.0000000000001659. Epub 2023 Sep 26.
Policy evaluation studies that assess how state-level policies affect health-related outcomes are foundational to health and social policy research. The relative ability of newer analytic methods to address confounding, a key source of bias in observational studies, has not been closely examined.
We conducted a simulation study to examine how differing magnitudes of confounding affected the performance of 4 methods used for policy evaluations: (1) the two-way fixed effects difference-in-differences model; (2) a 1-period lagged autoregressive model; (3) the augmented synthetic control method; and (4) the Callaway-Sant'Anna doubly robust difference-in-differences approach with multiple time periods. We simulated data with staggered policy adoption under multiple confounding scenarios (i.e., varying the magnitude and nature of the confounding relationships).
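As an illustration only (this is not the authors' simulation design or their R package), the following Python sketch generates a toy panel with staggered adoption in which the chance of adopting depends on a unit's early outcome level, then computes the two-way fixed effects estimate by two-way demeaning. The function names, parameter values, and data-generating process are all assumptions made for this sketch.

```python
import numpy as np


def simulate_panel(n_units=50, n_periods=20, effect=-1.0, confounding=0.0, seed=0):
    """Toy staggered-adoption panel; a hypothetical data-generating process."""
    rng = np.random.default_rng(seed)
    unit_fe = rng.normal(0.0, 1.0, n_units)          # unit-specific levels
    time_fe = np.linspace(0.0, 1.0, n_periods)       # common time trend
    noise = rng.normal(0.0, 0.5, (n_units, n_periods))
    y0 = unit_fe[:, None] + time_fe[None, :] + noise  # untreated outcomes

    # Confounding by prior outcome level: units with higher early outcomes
    # are more likely to adopt. confounding=0 gives a 50% adoption chance.
    early_level = y0[:, :5].mean(axis=1)
    p_adopt = 1.0 / (1.0 + np.exp(-confounding * (early_level - early_level.mean())))
    adopts = rng.random(n_units) < p_adopt

    # Staggered adoption: each adopter starts in a random period from 8 to 15.
    start = rng.integers(8, 16, n_units)
    periods = np.arange(n_periods)
    treated = (adopts[:, None] & (periods[None, :] >= start[:, None])).astype(float)

    return y0 + effect * treated, treated


def twfe_estimate(y, d):
    """Two-way fixed effects coefficient on d for a balanced panel.

    Removing unit and period means from both y and d, then regressing the
    residualized y on the residualized d, reproduces the coefficient from a
    regression with unit and period dummies.
    """
    def demean(a):
        return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()

    y_dm, d_dm = demean(y), demean(d)
    denom = (d_dm ** 2).sum()
    if np.isclose(denom, 0.0):
        raise ValueError("treatment has no variation after removing fixed effects")
    return (d_dm * y_dm).sum() / denom
```

Calling `simulate_panel(confounding=2.0)` and passing the result to `twfe_estimate`, repeated over many seeds, gives a rough view of how the estimate moves as the assumed confounding strength grows.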
Bias increased for each method: (1) as confounding magnitude increased; (2) when confounding was generated with respect to prior outcome trends (rather than levels); and (3) when confounding associations were nonlinear (rather than linear). The autoregressive model and augmented synthetic control method had notably lower root mean squared error than the two-way fixed effects and Callaway-Sant'Anna approaches in every scenario except nonlinear confounding by prior trends, where Callaway-Sant'Anna excelled. Coverage rates for the augmented synthetic control method were unreasonably high (e.g., 100%), reflecting large model-based standard errors and wide confidence intervals in practice.
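The performance measures named above (bias, root mean squared error, and confidence interval coverage) can be computed from a set of simulation replicates roughly as follows. This is a minimal sketch, assuming normal-approximation 95% intervals with critical value 1.96; the function name and inputs are hypothetical and not taken from the authors' R package.

```python
import numpy as np


def performance_metrics(estimates, std_errors, true_effect, z=1.96):
    """Summarize simulation replicates for one method (illustrative only)."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)

    errors = estimates - true_effect
    lower = estimates - z * std_errors
    upper = estimates + z * std_errors
    covered = (lower <= true_effect) & (true_effect <= upper)

    return {
        "bias": errors.mean(),                    # average signed error
        "rmse": np.sqrt((errors ** 2).mean()),    # root mean squared error
        "coverage": covered.mean(),               # share of intervals containing the truth
    }
```

Very large standard errors push `coverage` toward 1.0 regardless of how accurate the point estimates are, which is why coverage near 100% can signal overly wide intervals rather than good performance.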
In our simulation study, no single method consistently outperformed the others, so a researcher's toolkit should include all of these methodologic options. Our simulations and associated R package can help researchers choose the most appropriate approach for their data.