Duarte Guilherme, Finkelstein Noam, Knox Dean, Mummolo Jonathan, Shpitser Ilya
Operations, Information and Decisions Department, The Wharton School of the University of Pennsylvania, Philadelphia, PA.
Department of Computer Science, Whiting School of Engineering at the Johns Hopkins University, Baltimore, MD.
J Am Stat Assoc. 2024;119(547):1778-1793. doi: 10.1080/01621459.2023.2216909. Epub 2023 Aug 21.
Applied research conditions often make it impossible to point-identify causal estimands without untenable assumptions. Partial identification (bounds on the range of possible solutions) is a principled alternative, but the difficulty of deriving bounds in idiosyncratic settings has restricted its application. We present a general, automated numerical approach to causal inference in discrete settings. We show that causal questions with discrete data reduce to polynomial programming problems, then present an algorithm to automatically bound causal effects using efficient dual relaxation and spatial branch-and-bound techniques. The user declares an estimand, states assumptions, and provides data, however incomplete or mismeasured. The algorithm then searches over admissible data-generating processes and outputs the most precise possible range consistent with available information (that is, sharp bounds), including a point-identified solution if one exists. Because this search can be computationally intensive, our procedure reports and continually refines non-sharp ranges guaranteed to contain the truth at all times, even when the algorithm is not run to completion. Moreover, it offers an ε-sharpness guarantee, characterizing the worst-case looseness of the incomplete bounds. These techniques are implemented in our Python package, autobounds. Analytically validated simulations show the method accommodates classic obstacles, including confounding, selection, measurement error, noncompliance, and nonresponse. Supplementary materials for this article are available online.
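To make the idea of searching over admissible data-generating processes concrete, the following is a minimal sketch for the simplest discrete case: a binary treatment X and binary outcome Y with unmeasured confounding and no further assumptions. It does not use the autobounds API, and it does not implement the dual relaxation or spatial branch-and-bound techniques described above. Instead it brute-forces the extreme points of a small linear program, which is enough here because the objective is linear. The function name `ate_bounds`, the response-type encoding, and the example probabilities are all illustrative assumptions.

```python
from itertools import product

# Response types for a binary outcome, written as (Y(0), Y(1)):
# the outcome a unit would show without and with treatment.
RESPONSE_TYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def ate_bounds(p_xy):
    """Bound the average treatment effect from observed P(X=x, Y=y).

    p_xy maps (x, y) to a probability; the four values must sum to 1.
    No assumptions are imposed beyond consistency with the observed data.
    """
    if abs(sum(p_xy.values()) - 1.0) > 1e-9:
        raise ValueError("observed probabilities must sum to 1")

    cells = sorted(p_xy)
    # A unit observed with X=x and Y=y must have potential outcome Y(x) = y,
    # so each observed cell is compatible with exactly two response types.
    compatible = {
        (x, y): [t for t in RESPONSE_TYPES if t[x] == y] for (x, y) in cells
    }

    values = []
    # Each extreme point of the feasible set assigns all of a cell's mass
    # to a single compatible response type; a linear objective attains its
    # minimum and maximum at such points.
    for choice in product(*(compatible[c] for c in cells)):
        ate = sum(p_xy[c] * (t[1] - t[0]) for c, t in zip(cells, choice))
        values.append(ate)
    return min(values), max(values)


# Made-up observed distribution, for illustration only.
observed = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}
lower, upper = ate_bounds(observed)
print(f"ATE is bounded between {lower:.3f} and {upper:.3f}")
```

With no assumptions, the resulting interval always has width 1, so the data alone cannot sign the effect. Adding assumptions (for example, ruling out one response type) shrinks the set of admissible processes and tightens the bounds. Once assumptions or estimands introduce products of unknowns, the problem is no longer linear, which is where the polynomial programming formulation becomes necessary.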