Holovchak Anastasiia, McIlleron Helen, Denti Paolo, Schomaker Michael
Seminar für Statistik, ETH Zürich, Rämistrasse 101, 8092 Zürich, Switzerland.
Division of Clinical Pharmacology, Department of Medicine, Faculty of Health Sciences, University of Cape Town, 7935 Observatory, Cape Town, South Africa.
Biostatistics. 2024 Dec 31;26(1). doi: 10.1093/biostatistics/kxae044.
Missing data in multiple variables is a common issue. We investigate the applicability of the framework of graphical models for handling missing data to a complex longitudinal pharmacological study of children with HIV treated with an efavirenz-based regimen as part of the CHAPAS-3 trial. Specifically, we examine whether the causal effects of interest, defined through static interventions on multiple continuous variables, can be recovered (estimated consistently) from the available data only. So far, no general algorithms are available to decide on recoverability, and decisions have to be made on a case-by-case basis. We emphasize the sensitivity of recoverability to even the smallest changes in the graph structure, and present recoverability results for three plausible missingness-directed acyclic graphs (m-DAGs) in the CHAPAS-3 study, informed by clinical knowledge. Furthermore, we propose the concept of a "closed missingness mechanism": if missing data are generated based on this mechanism, an available case analysis is admissible for consistent estimation of any statistical or causal estimand, even if data are missing not at random (MNAR). Both simulations and theoretical considerations demonstrate how, in the assumed MNAR setting of our study, a complete or available case analysis can be superior to multiple imputation, and how estimation results vary depending on the assumed missingness DAG. Our analyses demonstrate an innovative application of missingness DAGs to complex longitudinal real-world data, while highlighting the sensitivity of the results with respect to the assumed causal model.
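The claim that a complete case analysis can remain consistent under data that are missing not at random can be illustrated with a toy simulation. The sketch below is not the authors' method and does not reproduce the CHAPAS-3 m-DAGs or the formal definition of a closed missingness mechanism. It only assumes a single covariate `x` whose missingness depends on its own value, and an outcome `y` that plays no role in the missingness. All names, coefficients, and sample sizes are hypothetical choices for illustration.

```python
import numpy as np


def simulate(n=200_000, seed=0):
    """Generate (x, y) and an observation indicator for x (toy setting)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, n)
    # Assumed outcome model: true intercept 1.0, true slope 2.0.
    y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, n)
    # The probability that x is observed depends on x itself, so x is
    # missing not at random. It does not depend on y given x.
    p_obs = 1.0 / (1.0 + np.exp(-(0.5 - 1.5 * x)))
    observed = rng.random(n) < p_obs
    return x, y, observed


def ols_slope(x, y):
    """Slope of the simple linear regression of y on x."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


x, y, observed = simulate()

# Complete case analysis: keep only the rows where x is observed.
slope_cc = ols_slope(x[observed], y[observed])
mean_x_cc = float(x[observed].mean())

print(f"complete-case slope of y on x: {slope_cc:.3f} (true value 2.0)")
print(f"complete-case mean of x:       {mean_x_cc:.3f} (true value 0.0)")
```

In this assumed setting the complete-case regression slope stays close to the true value, because selection depends only on the covariate, while the complete-case mean of `x` is clearly biased. Whether an estimand is recoverable therefore depends on both the estimand and the assumed missingness structure, which is the kind of sensitivity the abstract describes.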