Moving beyond the classic difference-in-differences model: a simulation study comparing statistical methods for estimating effectiveness of state-level policies.

Author affiliations

RAND Corporation, 1200 South Hayes Street, Arlington, VA, 22202, USA.

Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA.

Publication information

BMC Med Res Methodol. 2021 Dec 13;21(1):279. doi: 10.1186/s12874-021-01471-y.

Abstract

BACKGROUND

Reliable evaluations of state-level policies are essential for identifying effective policies and informing policymakers' decisions. State-level policy evaluations commonly use a difference-in-differences (DID) study design, yet within this framework, statistical model specification varies notably across studies. More guidance is needed about which statistical models perform best when estimating how state-level policies affect outcomes.

METHODS

Motivated by applied state-level opioid policy evaluations, we implemented an extensive simulation study to compare the statistical performance of multiple variations of the two-way fixed effect models traditionally used for DID under a range of simulation conditions. We also explored the performance of autoregressive (AR) and generalized estimating equation (GEE) models. We simulated policy effects on annual state-level opioid mortality rates and assessed statistical performance using various metrics, including directional bias, magnitude bias, and root mean squared error. Given the prevalence of frequentist null hypothesis significance testing in the applied literature, we also reported Type I error rates and the rate of correctly rejecting the null hypothesis (e.g., power).
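The performance metrics named above can be illustrated with a minimal sketch. The definitions used here are assumptions, since the abstract does not spell them out: directional bias is taken as the mean signed error, magnitude bias as the mean absolute error, and the rejection rate as the share of simulated datasets with p < alpha (a Type I error rate when the true effect is zero, a power-like rate otherwise). The function name `performance_metrics` and its arguments are hypothetical.

```python
import math


def performance_metrics(estimates, p_values, true_effect, alpha=0.05):
    """Summarize simulation results for one model (illustrative definitions).

    estimates   -- estimated policy effect from each simulated dataset
    p_values    -- p-value for the policy effect from each simulated dataset
    true_effect -- the effect that was built into the simulation
    """
    n = len(estimates)
    errors = [est - true_effect for est in estimates]

    # Assumed definition: mean signed error (can cancel across simulations).
    directional_bias = sum(errors) / n
    # Assumed definition: mean absolute error (cannot cancel).
    magnitude_bias = sum(abs(e) for e in errors) / n
    # Root mean squared error of the effect estimates.
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Share of simulations rejecting the null at level alpha.
    # With true_effect == 0 this is the Type I error rate.
    rejection_rate = sum(1 for p in p_values if p < alpha) / n

    return {
        "directional_bias": directional_bias,
        "magnitude_bias": magnitude_bias,
        "rmse": rmse,
        "rejection_rate": rejection_rate,
    }


if __name__ == "__main__":
    # Two toy simulation replicates with a true effect of 2.0.
    print(performance_metrics([1.0, 3.0], [0.01, 0.20], true_effect=2.0))
```

Reporting directional and magnitude bias separately matters because errors of opposite sign cancel in the first but not in the second, so a model can show near-zero directional bias while still missing the true effect by a wide margin in individual simulations.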

RESULTS

Most linear models resulted in minimal bias. However, non-linear models and population-weighted versions of classic linear two-way fixed effect and linear GEE models yielded considerable bias (60% to 160%). Further, root mean squared error was minimized by linear AR models when we examined crude mortality rates and by negative binomial models when we examined raw death counts. In the context of frequentist hypothesis testing, many models yielded high Type I error rates and very low rates of correctly rejecting the null hypothesis (< 10%), raising concerns of spurious conclusions about policy effectiveness in the opioid literature. When considering performance across models, the linear AR models were optimal in terms of directional bias, root mean squared error, Type I error, and correct rejection rates.

CONCLUSIONS

The findings highlight notable limitations of commonly used statistical models for DID designs, which are widely used in opioid policy studies and in state policy evaluations more broadly. In contrast, the optimal model we identified, the AR model, is rarely used in state policy evaluation. We urge applied researchers to move beyond the classic DID paradigm and adopt AR models.
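For orientation, the contrast between the two model families can be sketched as follows. This is one common way to write them, not the study's exact specifications, which the abstract does not give. All symbols are assumptions: Y_{st} is the outcome in state s and year t, A_{st} is a policy indicator, \alpha_s and \gamma_t are state and year fixed effects, \beta is the policy effect, and \rho is the coefficient on the lagged outcome.

```latex
% Classic two-way fixed effects DID model (assumed generic form)
Y_{st} = \alpha_s + \gamma_t + \beta A_{st} + \varepsilon_{st}

% Linear autoregressive (AR) model (assumed generic form):
% the lagged outcome replaces the state fixed effect
Y_{st} = \rho \, Y_{s,t-1} + \gamma_t + \beta A_{st} + \varepsilon_{st}
```

In the first form, stable differences between states are absorbed by \alpha_s. In the second, each state's own prior-year outcome carries that information.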

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/03b9/8667411/42f51a042e8e/12874_2021_1471_Fig1_HTML.jpg
