Cattaneo Matias D, Keele Luke, Titiunik Rocío
Dept. of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey, USA.
Dept. of Surgery, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
Stat Med. 2023 Oct 30;42(24):4484-4513. doi: 10.1002/sim.9861. Epub 2023 Aug 1.
We present a practical guide for the analysis of regression discontinuity (RD) designs in biomedical contexts. We begin by introducing key concepts, assumptions, and estimands within both the continuity-based framework and the local randomization framework. We then discuss modern estimation and inference methods within both frameworks, including approaches for bandwidth or local neighborhood selection, optimal treatment effect point estimation, and robust bias-corrected inference methods for uncertainty quantification. We also review empirical falsification tests that can be used to support key assumptions. Our discussion focuses on two particular features that are relevant in biomedical research: (i) fuzzy RD designs, which often arise when therapeutic treatments are based on clinical guidelines, but patients with scores near the cutoff are treated contrary to the assignment rule; and (ii) RD designs with discrete scores, which are ubiquitous in biomedical applications. We illustrate our discussion with three empirical applications: the effect of CD4 guidelines for anti-retroviral therapy on retention of HIV patients in South Africa, the effect of genetic guidelines for chemotherapy on breast cancer recurrence in the United States, and the effects of age-based patient cost-sharing on healthcare utilization in Taiwan. Complete replication materials employing publicly available data and statistical software in Python, R and Stata are provided, offering researchers all necessary tools to conduct an RD analysis.