From the Department of Public Health and Community Medicine, Tufts University School of Medicine, Boston, MA.
Mayo Clinic Alix School of Medicine, Mayo Clinic College of Medicine and Science, Rochester, MN.
Epidemiology. 2024 Sep 1;35(5):642-653. doi: 10.1097/EDE.0000000000001758. Epub 2024 Jun 11.
Causal graphs are an important tool for covariate selection, but there is limited applied research on how best to create them. Here, we used data from the Coronary Drug Project trial to assess a range of approaches to directed acyclic graph (DAG) creation. We focused on the effect of adherence on mortality in the placebo arm, since the true causal effect is believed, with a high degree of certainty, to be null.
We created DAGs for the effect of placebo adherence on mortality using different approaches for identifying variables and links to include or exclude. For each DAG, we identified minimal adjustment sets of covariates for estimating our causal effect of interest and applied these to analyses of the Coronary Drug Project data.
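To make the idea of a minimal adjustment set concrete, the following is a minimal sketch that enumerates minimal backdoor adjustment sets by brute force on a small toy DAG. Everything in it is an assumption made for illustration: the node names (A for exposure, Y for outcome, C1 and C2 for confounders, M for a prognostic factor on a backdoor path, P for a pure prognostic factor, Z for a pure exposure predictor), the edges, and the helper functions are hypothetical and are not the DAGs or software used in the study. The d-separation check uses the standard moralized ancestral graph test.

```python
from itertools import combinations

def _reach(start, neighbours):
    """All nodes reachable from `start` (inclusive) by following `neighbours`."""
    seen, stack = set(start), list(start)
    while stack:
        node = stack.pop()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def d_separated(edges, x, y, given):
    """True if `given` d-separates x and y (moralized ancestral graph test)."""
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    keep = _reach({x, y} | set(given), parents)   # ancestral closure
    moral = {n: set() for n in keep}
    for child in keep:
        pa = sorted(parents.get(child, ()))
        for p in pa:                               # undirected parent-child link
            moral[p].add(child)
            moral[child].add(p)
        for p, q in combinations(pa, 2):           # "marry" co-parents
            moral[p].add(q)
            moral[q].add(p)
    for n in given:                                # conditioning removes the node
        for m in moral.pop(n):
            moral[m].discard(n)
    return y not in _reach({x}, moral)

def minimal_backdoor_sets(edges, exposure, outcome):
    """Minimal sets satisfying the backdoor criterion, found by enumeration."""
    children = {}
    for u, v in edges:
        children.setdefault(u, set()).add(v)
    descendants = _reach({exposure}, children)     # includes the exposure itself
    nodes = {n for edge in edges for n in edge}
    candidates = sorted(nodes - descendants - {outcome})
    # Backdoor graph: drop every edge that leaves the exposure.
    backdoor = [(u, v) for u, v in edges if u != exposure]
    minimal = []
    for size in range(len(candidates) + 1):
        for subset in combinations(candidates, size):
            if any(set(m) <= set(subset) for m in minimal):
                continue                           # a smaller valid set exists
            if d_separated(backdoor, exposure, outcome, subset):
                minimal.append(subset)
    return minimal

# Hypothetical toy DAG (illustration only, not a DAG from the trial).
TOY_EDGES = [
    ("C1", "A"), ("C1", "Y"),   # C1: confounder
    ("C2", "A"), ("C2", "M"),   # C2: exposure predictor on a backdoor path
    ("M", "Y"),                 # M: prognostic factor on the same path
    ("P", "Y"),                 # P: prognostic factor only
    ("Z", "A"),                 # Z: exposure predictor only
    ("A", "Y"),
]

print(minimal_backdoor_sets(TOY_EDGES, "A", "Y"))
# [('C1', 'C2'), ('C1', 'M')]
```

In this toy graph, neither P nor Z appears in any minimal set, and the backdoor path through C2 and M can be blocked at either node. Exhaustive enumeration is exponential in the number of candidate covariates, so this approach is only sensible for small hand-drawn graphs.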
When we used only baseline covariate values to estimate the cumulative effect of placebo adherence on mortality, all adjustment sets performed similarly. The specific choice of covariates had minimal effect on these (biased) point estimates, but including nonconfounding prognostic factors resulted in smaller variance estimates. When we additionally adjusted for time-varying covariates of adherence using inverse probability weighting, covariates identified from the DAG created by focusing on prognostic factors performed best.
Theoretical advice on covariate selection suggests that including prognostic factors that are not exposure predictors can reduce variance without increasing bias. In contrast, including exposure predictors that are not prognostic factors may weaken bias control. Our results empirically confirm this advice. We recommend that creating DAGs by hand begin with the identification of all potential prognostic factors for the outcome.
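The theoretical advice above can be illustrated with a small simulation. The sketch below is not an analysis of the Coronary Drug Project data; it assumes a hypothetical linear data-generating model with standard normal noise, a true exposure effect of zero, an unmeasured confounder U, a pure prognostic factor P, and a pure exposure predictor Z. All variable names, coefficients, and helper functions are invented for illustration. It compares the exposure coefficient from ordinary least squares with no adjustment, with adjustment for P, and with adjustment for Z.

```python
import random
import statistics

def slope(x, y):
    """OLS slope of y on x (model with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def residuals(x, y):
    """Residuals of y after OLS regression on x (model with intercept)."""
    b = slope(x, y)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    return [yi - my - b * (xi - mx) for xi, yi in zip(x, y)]

def adjusted_slope(a, y, c):
    """Coefficient of a in OLS of y on (a, c), via Frisch-Waugh-Lovell."""
    return slope(residuals(c, a), residuals(c, y))

def simulate(n=500, reps=300, seed=0):
    """Return {strategy: (mean estimate, variance of estimates)}.

    The true effect of A on Y is 0, so the mean estimate equals the bias.
    """
    rng = random.Random(seed)
    est = {"unadjusted": [], "adjust_P": [], "adjust_Z": []}
    for _ in range(reps):
        u = [rng.gauss(0, 1) for _ in range(n)]   # unmeasured confounder
        z = [rng.gauss(0, 1) for _ in range(n)]   # predicts exposure only
        p = [rng.gauss(0, 1) for _ in range(n)]   # predicts outcome only
        a = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
        y = [2 * pi + ui + rng.gauss(0, 1) for pi, ui in zip(p, u)]
        est["unadjusted"].append(slope(a, y))
        est["adjust_P"].append(adjusted_slope(a, y, p))
        est["adjust_Z"].append(adjusted_slope(a, y, z))
    return {k: (statistics.fmean(v), statistics.variance(v))
            for k, v in est.items()}

results = simulate()
for name, (bias, var) in results.items():
    print(f"{name:>10}: bias={bias:.3f}  variance={var:.5f}")
```

Under these assumed coefficients, the large-sample value of the unadjusted estimate is 1/3, adjusting for P leaves it at 1/3 while shrinking its sampling variance, and adjusting for Z moves it to 1/2, further from the true value of zero. The bias itself comes from the unmeasured confounder U, which none of the three strategies can remove.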