Am J Epidemiol. 2024 Jul 8;193(7):1019-1030. doi: 10.1093/aje/kwae012.
Targeted maximum likelihood estimation (TMLE) is increasingly used for doubly robust causal inference, but how missing data should be handled when using TMLE with data-adaptive approaches is unclear. Based on data (1992-1998) from the Victorian Adolescent Health Cohort Study, we conducted a simulation study to evaluate 8 missing-data methods in this context: complete-case analysis, extended TMLE incorporating an outcome-missingness model, the missing covariate missing indicator method, and 5 multiple imputation (MI) approaches using parametric or machine-learning models. We considered 6 scenarios that varied in terms of exposure/outcome generation models (presence of confounder-confounder interactions) and missingness mechanisms (whether outcome influenced missingness in other variables and presence of interaction/nonlinear terms in missingness models). Complete-case analysis and extended TMLE had small biases when outcome did not influence missingness in other variables. Parametric MI without interactions had large bias when exposure/outcome generation models included interactions. Parametric MI including interactions performed best in bias and variance reduction across all settings, except when missingness models included a nonlinear term. When choosing a method for handling missing data in the context of TMLE, researchers must consider the missingness mechanism and, for MI, compatibility with the analysis method. In many settings, a parametric MI approach that incorporates interactions and nonlinearities is expected to perform well.
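As a rough illustration of the two simplest methods named above, the sketch below shows complete-case analysis and the missing covariate missing indicator method on a toy array. It assumes missing values are coded as NaN in NumPy arrays. The function names, the toy data, and the fill value of 0 are hypothetical choices made for illustration and are not taken from the study. The subsequent TMLE fit is not shown.

```python
import numpy as np


def complete_case(X, y):
    """Complete-case analysis: keep only rows with no missing value in X or y."""
    keep = ~np.isnan(X).any(axis=1) & ~np.isnan(y)
    return X[keep], y[keep]


def missing_indicator(X, fill_value=0.0):
    """Missing-indicator method for covariates.

    Missing covariate values are replaced by a constant (an assumed fill
    value of 0 here), and one 0/1 indicator column is appended for each
    covariate that has at least one missing value.
    """
    miss = np.isnan(X)
    cols_with_missing = miss.any(axis=0)
    X_filled = np.where(miss, fill_value, X)
    indicators = miss[:, cols_with_missing].astype(float)
    return np.hstack([X_filled, indicators])


# Toy data (hypothetical): two covariates and an outcome, NaN marks missing.
X = np.array([
    [1.0, 2.0],
    [np.nan, 3.0],
    [4.0, np.nan],
    [5.0, 6.0],
])
y = np.array([0.0, 1.0, 1.0, np.nan])

X_cc, y_cc = complete_case(X, y)   # only the first row is fully observed
X_mi = missing_indicator(X)        # 2 filled covariates + 2 indicator columns
```

Complete-case analysis discards every record with any missing value, whereas the missing indicator approach keeps all records and lets the downstream model adjust for the missingness pattern in the covariates. Neither function imputes the outcome.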