Smith Matthew J, Mansournia Mohammad A, Maringe Camille, Zivich Paul N, Cole Stephen R, Leyrat Clémence, Belot Aurélien, Rachet Bernard, Luque-Fernandez Miguel A
Inequalities in Cancer Outcomes Network, Department of Non-communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, UK.
Department of Epidemiology and Biostatistics, Tehran University of Medical Sciences, Tehran, Iran.
Stat Med. 2022 Jan 30;41(2):407-432. doi: 10.1002/sim.9234. Epub 2021 Oct 28.
The main purpose of many medical studies is to estimate the effect of a treatment or exposure on an outcome. However, it is not always possible to randomize study participants to a particular treatment, so observational study designs may be used instead. Observational studies pose major challenges, one of which is confounding. Confounding is commonly controlled by direct adjustment for measured confounders, although this approach is sometimes suboptimal because of modeling assumptions and model misspecification. Recent advances in the field of causal inference deal with confounding by building on classical standardization methods. However, these advances have progressed quickly, with a relative paucity of computationally oriented applied tutorials, which has contributed to some confusion about the use of these methods among applied researchers. In this tutorial, we show the computational implementation of different causal inference estimators (ie, nonparametric and parametric g-formula, inverse probability weighting, double-robust, and data-adaptive estimators) from a historical perspective, in which each new estimator was developed to overcome the limitations of its predecessors. We illustrate the implementation of the different methods using an empirical example from the Connors study, based on intensive care medicine, and, most importantly, we provide reproducible and commented code in Stata, R, and Python for researchers to adapt in their own observational studies. The code can be accessed at https://github.com/migariane/Tutorial_Computational_Causal_Inference_Estimators.
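To make two of the estimators named in the abstract concrete, the following is a minimal sketch, not the authors' tutorial code, of the nonparametric g-formula (standardization) and inverse probability weighting for a risk difference. Everything in it is an assumption made for illustration: the data are simulated rather than taken from the Connors study, there is a single binary confounder `w`, and the function names (`simulate`, `naive_difference`, `g_formula`, `ipw`) and the data-generating probabilities are hypothetical.

```python
import random


def simulate(n, seed=0):
    """Simulate (w, a, y) triples with one binary confounder w (hypothetical data).

    By construction, the true risk difference for treatment a is 0.2.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        w = int(rng.random() < 0.5)                      # confounder
        a = int(rng.random() < 0.2 + 0.6 * w)            # treatment depends on w
        y = int(rng.random() < 0.1 + 0.2 * a + 0.3 * w)  # outcome depends on a and w
        data.append((w, a, y))
    return data


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_difference(data):
    """Unadjusted risk difference; confounded by w."""
    treated = mean(y for _, a, y in data if a == 1)
    untreated = mean(y for _, a, y in data if a == 0)
    return treated - untreated


def g_formula(data):
    """Nonparametric g-formula: stratum-specific risk differences,
    averaged over the marginal distribution of the confounder."""
    n = len(data)
    ate = 0.0
    for level in (0, 1):
        stratum = [(a, y) for w, a, y in data if w == level]
        p_w = len(stratum) / n
        risk1 = mean(y for a, y in stratum if a == 1)
        risk0 = mean(y for a, y in stratum if a == 0)
        ate += p_w * (risk1 - risk0)
    return ate


def ipw(data):
    """Inverse probability weighting with a saturated (stratum-specific)
    propensity score estimate."""
    n = len(data)
    ps = {level: mean(a for w, a, _ in data if w == level) for level in (0, 1)}
    mu1 = sum(y / ps[w] for w, a, y in data if a == 1) / n
    mu0 = sum(y / (1 - ps[w]) for w, a, y in data if a == 0) / n
    return mu1 - mu0


data = simulate(20000)
print("naive:    ", round(naive_difference(data), 3))
print("g-formula:", round(g_formula(data), 3))
print("IPW:      ", round(ipw(data), 3))
```

In this saturated setting the two adjusted estimators coincide algebraically, while the unadjusted contrast is biased by the confounder. With many or continuous confounders, parametric or data-adaptive models would replace the stratum-specific means and propensities, which is where model misspecification becomes a concern.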