Cox Associates, Denver, CO, USA.
Crit Rev Toxicol. 2018 Sep;48(8):682-712. doi: 10.1080/10408444.2018.1518404. Epub 2018 Nov 15.
Perhaps no other topic in risk analysis is more difficult, more controversial, or more important to risk management policy analysts and decision-makers than how to draw valid, correctly qualified causal conclusions from observational data. Statistical methods can readily quantify associations between observed variables using measures such as relative risk (RR) ratios, odds ratios (OR), slope coefficients for exposure or treatment variables in regression models, and quantities derived from these measures. Textbooks of epidemiology explain how to calculate population attributable fractions, attributable risks, burden-of-disease estimates, and probabilities of causation from RR ratios. Despite their suggestive names, these association-based measures have no necessary connection to causation if the associations on which they are based arise from bias, confounding, p-hacking, coincident historical trends, or other noncausal sources. But policy analysts and decision-makers need something more: trustworthy predictions - and, later, evaluations - of the changes in outcomes caused by changes in policy variables. This concept of manipulative causation differs from the more familiar concepts of associational and attributive causation most widely used in epidemiology. Drawing on modern literature on causal discovery and inference principles and algorithms for drawing limited but useful causal conclusions from observational data, we propose seven criteria for assessing consistency of data with a manipulative causal exposure-response relationship - mutual information, directed dependence, internal and external consistency, coherent causal explanation of biological plausibility, causal mediation confirmation, and refutation of non-causal explanations - and discuss to what extent it is now possible to automate discovery of manipulative causal dependencies and quantification of causal effects from observational data.
We compare our proposed principles for causal discovery and inference to the traditional Bradford Hill considerations from 1965. Understanding how old and new principles are related can clarify and enrich both.
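As a rough illustration of the gap between association-based measures and manipulative causation described above, the following sketch computes a crude RR, an OR, and a population attributable fraction from 2x2 counts. All counts are hypothetical and are not taken from the paper; the confounder Z, the function names, and the use of Levin's formula for the attributable fraction are assumptions made for this example only. The data are constructed so that exposure has no effect on risk within either stratum of Z.

```python
# Illustrative sketch only: every count below is invented, not from the paper.

def relative_risk(a, b, c, d):
    """RR from a 2x2 table.

    a, b = exposed cases, exposed non-cases
    c, d = unexposed cases, unexposed non-cases
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Cross-product odds ratio from the same 2x2 layout."""
    return (a * d) / (b * c)


def levin_paf(p_exposed, rr):
    """Population attributable fraction via Levin's formula.

    This is an association-based quantity: it is only a causal estimate
    if the RR itself reflects a causal effect.
    """
    excess = p_exposed * (rr - 1.0)
    return excess / (1.0 + excess)


# Hypothetical strata of a confounder Z. Within each stratum the risk is
# identical for exposed and unexposed people (0.4 when Z is high, 0.1 when
# Z is low), but exposure is far more common when Z is high.
strata = {
    "Z=high": (320, 480, 80, 120),
    "Z=low": (20, 180, 80, 720),
}

# Collapse over Z to get the crude (unadjusted) 2x2 table.
crude = tuple(sum(cells) for cells in zip(*strata.values()))

crude_rr = relative_risk(*crude)
crude_or = odds_ratio(*crude)
p_exposed = (crude[0] + crude[1]) / sum(crude)
paf = levin_paf(p_exposed, crude_rr)
stratum_rr = {name: relative_risk(*cells) for name, cells in strata.items()}

print(f"crude RR = {crude_rr:.3f}, crude OR = {crude_or:.3f}, PAF = {paf:.2f}")
for name, rr in stratum_rr.items():
    print(f"RR within {name}: {rr:.3f}")
```

In this made-up data set the crude RR is about 2.1 and the attributable fraction is 0.36, yet the RR within each stratum of Z is 1. Under these assumptions, removing the exposure would leave risk unchanged, so the crude measures describe an association and not the change in outcomes that a policy change would cause.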