Two-stage g-computation: Evaluating Treatment and Intervention Impacts in Observational Cohorts When Exposure Information Is Partly Missing.

Author information

From the Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC.

Department of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC.

Publication information

Epidemiology. 2020 Sep;31(5):695-703. doi: 10.1097/EDE.0000000000001233.

DOI:10.1097/EDE.0000000000001233

PMID:32657953

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8725064/

Abstract

Illustrations of the g-computation algorithm to evaluate population average treatment and intervention effects have been predominantly implemented in settings with complete exposure information. Thus, worked examples of approaches to handle missing data in this causal framework are needed to facilitate wider use of these estimators. We illustrate two-stage g-computation estimators that leverage partially observed information on the full study sample and complete exposure information on a subset to estimate causal effects. In a hypothetical cohort of 1,623 human immunodeficiency virus (HIV)-positive women with 30% complete opioid prescription information, we illustrate a two-stage extrapolation g-computation estimator for the average treatment effect of shorter or longer duration opioid prescriptions; we further illustrate two-stage inverse probability weighting and imputation g-computation estimators for the average intervention effect of shortening the duration of prescriptions relative to the status quo. Two-stage g-computation estimators approximated the true risk differences for the population average treatment and intervention effects while g-computation fit to the subset of complete cases was biased. In 10,000 Monte Carlo simulations, two-stage approaches considerably reduced bias and mean squared error and improved the coverage of 95% confidence limits. Although missing data threaten validity and precision, two-stage g-computation designs offer principled approaches to handling missing information.
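To make the two-stage idea concrete, the following is a minimal sketch of an inverse probability weighted two-stage g-computation on simulated data. It is not the authors' implementation: the data-generating model, the single binary confounder `w`, the function names, and the choice of stratified (nonparametric) estimates in place of regression models are all assumptions made for illustration. Stage 1 estimates the probability that exposure is observed given the fully observed variables; stage 2 fits a weighted outcome model among complete cases and standardizes it to the confounder distribution of the full sample.

```python
import numpy as np


def simulate(n, rng):
    """Hypothetical cohort; every parameter here is an illustrative assumption."""
    w = rng.binomial(1, 0.4, n)                     # confounder, seen for everyone
    a = rng.binomial(1, 0.3 + 0.4 * w)              # exposure (1 = longer prescription)
    y = rng.binomial(1, 0.1 + 0.15 * a + 0.2 * w)   # outcome; true risk difference = 0.15
    r = rng.binomial(1, 0.225 + 0.3 * y)            # 1 if exposure is observed (about 30%)
    return w, a, y, r


def standardized_rd(w, a, y, weights, complete):
    """G-computation risk difference using (weighted) complete cases.

    Outcome risks are estimated within strata of exposure and confounder
    among complete cases, then averaged over the full-sample distribution of w.
    """
    risk = {}
    for av in (0, 1):
        total = 0.0
        for wv in (0, 1):
            s = complete & (a == av) & (w == wv)
            stratum_risk = np.average(y[s], weights=weights[s])
            total += stratum_risk * np.mean(w == wv)
        risk[av] = total
    return risk[1] - risk[0]


def two_stage_ipw_gcomp(w, a, y, r):
    """Stage 1: weights = 1 / P(R=1 | W, Y). Stage 2: weighted g-computation."""
    weights = np.zeros(len(w))
    for wv in (0, 1):
        for yv in (0, 1):
            s = (w == wv) & (y == yv)
            weights[s] = 1.0 / r[s].mean()   # stratified estimate of P(R=1 | W, Y)
    return standardized_rd(w, a, y, weights, r == 1)


def complete_case_gcomp(w, a, y, r):
    """Naive comparison: unweighted g-computation on complete cases only."""
    return standardized_rd(w, a, y, np.ones(len(w)), r == 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w, a, y, r = simulate(200_000, rng)
    print("two-stage IPW g-computation:", two_stage_ipw_gcomp(w, a, y, r))
    print("complete-case g-computation:", complete_case_gcomp(w, a, y, r))
```

In this simulated setup, exposure is more likely to be recorded when the outcome occurred, so the complete-case estimate drifts away from the true risk difference of 0.15 while the weighted estimate should stay close to it. The code uses the true exposure array only through the complete-case mask; a real analysis would have missing values there and would typically use regression models rather than stratification.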
