Murray Eleanor J, Robins James M, Seage George R, Freedberg Kenneth A, Hernán Miguel A
Am J Epidemiol. 2017 Jul 15;186(2):131-142. doi: 10.1093/aje/kwx091.
Decision-making requires choosing among treatments on the basis of correctly estimated outcome distributions under each treatment. In the absence of randomized trials, 2 possible approaches are the parametric g-formula and agent-based models (ABMs). The g-formula has been used exclusively to estimate effects in the population from which data were collected, whereas ABMs are commonly used to estimate effects in multiple populations, necessitating stronger assumptions. Here, we describe potential biases that arise when ABM assumptions do not hold. To do so, we estimated 12-month mortality risk in simulated populations differing in prevalence of an unknown common cause of mortality and a time-varying confounder. The ABM and g-formula correctly estimated mortality and causal effects when all inputs were from the target population. However, whenever any inputs came from another population, the ABM gave biased estimates of mortality, and often of causal effects, even when the true effect was null. In the absence of unmeasured confounding and model misspecification, both methods produce valid causal inferences for a given population when all inputs are from that population. However, ABMs may result in bias when extrapolated to populations that differ on the distribution of unmeasured outcome determinants, even when the causal network linking variables is identical.
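The mechanism described above can be illustrated with a minimal sketch. It is not the simulation used in the study: it collapses follow-up to a single interval, computes expectations exactly instead of simulating individual agents, and every parameter value and function name (`p_l1`, `observed_inputs`, `risk`) is an assumption chosen for illustration. An unmeasured binary cause U affects both a confounder L and mortality Y, treatment A affects L but not Y (so the true effect is null), and the two populations differ only in the prevalence of U.

```python
# Toy model, assumed for illustration only; none of these values come from the paper.
P_Y_GIVEN_U = {0: 0.05, 1: 0.30}  # mortality depends on U alone: true treatment effect is null


def p_l1(a, u):
    """P(L=1 | A=a, U=u): both treatment and U raise the confounder."""
    return 0.1 + 0.3 * a + 0.5 * u


def observed_inputs(p_u):
    """Inputs estimable from a population with P(U=1)=p_u when U is unmeasured.

    Returns P(L=l | A=a) keyed by (l, a) and P(Y=1 | A=a, L=l) keyed by (a, l).
    """
    prior = {0: 1.0 - p_u, 1: p_u}
    l_given_a, y_given_al = {}, {}
    for a in (0, 1):
        for l in (0, 1):
            # Joint P(L=l, U=u | A=a) for each level of the unmeasured cause.
            joint = {}
            for u in (0, 1):
                p1 = p_l1(a, u)
                joint[u] = (p1 if l == 1 else 1.0 - p1) * prior[u]
            marginal = joint[0] + joint[1]
            l_given_a[(l, a)] = marginal
            # P(Y=1 | a, l) averages over P(U | a, l), which depends on p_u.
            y_given_al[(a, l)] = sum(
                P_Y_GIVEN_U[u] * joint[u] / marginal for u in (0, 1)
            )
    return l_given_a, y_given_al


def risk(a, l_given_a, y_given_al):
    """Standardized risk under treatment a: sum over l of P(Y | a, l) * P(l | a)."""
    return sum(y_given_al[(a, l)] * l_given_a[(l, a)] for l in (0, 1))


target_l, target_y = observed_inputs(p_u=0.2)      # target population
external_l, external_y = observed_inputs(p_u=0.6)  # population supplying outside inputs

# All inputs from the target population (the g-formula setting).
risk_same = {a: risk(a, target_l, target_y) for a in (0, 1)}
rd_same = risk_same[1] - risk_same[0]

# Confounder distribution from the target, outcome model from the external population.
risk_mixed = {a: risk(a, target_l, external_y) for a in (0, 1)}
rd_mixed = risk_mixed[1] - risk_mixed[0]

print("same-population risks:", risk_same, "risk difference:", rd_same)
print("mixed-input risks:", risk_mixed, "risk difference:", rd_mixed)
```

Under these assumed values, the same-population calculation recovers the true target risk of 0.10 with a null risk difference, whereas the mixed-input calculation overstates mortality and yields a small nonzero risk difference despite the null effect. The size of the discrepancy depends entirely on the illustrative parameters.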