Matthay Ellicott C, Hagan Erin, Gottlieb Laura M, Tan May Lynn, Vlahov David, Adler Nancy E, Glymour M Maria
Center for Health and Community, University of California, San Francisco, 3333 California St, Suite 465, Campus Box 0844, San Francisco, CA, 94143-0844, USA.
Department of Epidemiology and Biostatistics, University of California, San Francisco, 550 16th Street, 2nd Floor, Campus Box 0560, San Francisco, CA, 94143, USA.
SSM Popul Health. 2019 Dec 9;10:100526. doi: 10.1016/j.ssmph.2019.100526. eCollection 2020 Apr.
Population health researchers from different fields often address similar substantive questions but rely on different study designs, reflecting their home disciplines. This is especially true in studies involving causal inference, for which semantic and substantive differences inhibit interdisciplinary dialogue and collaboration. In this paper, we group nonrandomized study designs into two categories: those that use confounder-control (such as regression adjustment or propensity score matching) and those that rely on an instrument (such as instrumental variables, regression discontinuity, or differences-in-differences approaches). Using the Shadish, Cook, and Campbell framework for evaluating threats to validity, we contrast the assumptions, strengths, and limitations of these two approaches and illustrate differences with examples from the literature on education and health. Across disciplines, all methods to test a hypothesized causal relationship involve unverifiable assumptions, and rarely is there clear justification for exclusive reliance on one method. Each method entails trade-offs between statistical power, internal validity, measurement quality, and generalizability. The choice between confounder-control and instrument-based methods should be guided by these trade-offs and consideration of the most important limitations of previous work in the area. Our goals are to foster common understanding of the methods available for causal inference in population health research and the trade-offs between them; to encourage researchers to objectively evaluate what can be learned from methods outside their home discipline; and to facilitate the selection of methods that best answer the investigator's scientific questions.
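The contrast between the two categories can be illustrated with a small simulation. The sketch below is not taken from the paper: the data-generating model, the coefficient values, and the function names (`simulate`, `ols_adjusted`, `iv_ratio`) are all assumptions chosen for illustration. It compares regression adjustment for a measured confounder with a simple ratio (Wald-type) instrumental variable estimator when an unmeasured confounder is also present.

```python
import numpy as np

def simulate(n=200_000, true_effect=2.0, seed=0):
    """Simulated data (hypothetical model, not from the paper)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)  # unmeasured confounder
    c = rng.normal(size=n)  # measured confounder
    z = rng.normal(size=n)  # instrument: affects y only through x
    x = 0.5 * z + 1.0 * u + 1.0 * c + rng.normal(size=n)
    y = true_effect * x + 1.5 * u + 1.0 * c + rng.normal(size=n)
    return z, c, x, y

def ols_adjusted(x, c, y):
    """Confounder-control: regress y on x, adjusting for measured c only."""
    design = np.column_stack([np.ones_like(x), x, c])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]  # coefficient on x

def iv_ratio(z, x, y):
    """Instrument-based: ratio of cov(z, y) to cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

z, c, x, y = simulate()
ols_est = ols_adjusted(x, c, y)
iv_est = iv_ratio(z, x, y)
print(f"true effect: 2.0, adjusted OLS: {ols_est:.3f}, IV: {iv_est:.3f}")
```

Under these assumed parameters, the adjusted regression remains biased upward (by about 0.67 in large samples) because it cannot account for the unmeasured confounder, while the instrument-based estimate is close to the true effect. The IV estimate depends on its own unverifiable assumptions (here built into the simulation: the instrument is independent of both confounders and affects the outcome only through the exposure) and is typically less precise, which is one form of the power versus internal validity trade-off the abstract describes.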