Leeds Institute for Data Analytics, University of Leeds, Leeds, UK.
School of Medicine, University of Leeds, Leeds, UK.
Int J Epidemiol. 2020 Aug 1;49(4):1307-1313. doi: 10.1093/ije/dyaa021.
Compositional data comprise the parts of some whole, for which all parts sum to that whole. They are prevalent in many epidemiological contexts. Although many of the challenges associated with analysing compositional data have been discussed previously, we revisit them within a formal causal framework using directed acyclic graphs (DAGs).
We depict compositional data using DAGs and identify two distinct effect estimands in the generic case: (i) the total effect, and (ii) the relative effect. We consider each in the context of three specific example scenarios involving compositional data: (1) the relationship between the economically active population and area-level gross domestic product; (2) the relationship between fat consumption and body weight; and (3) the relationship between time spent sedentary and body weight. For each, we consider the distinct interpretation of each effect, and the resulting implications for related analyses.
For scenarios (1) and (2), both the total and relative effects may be identifiable and causally meaningful, depending upon the specific question of interest. For scenario (3), only the relative effect is identifiable. In all scenarios, the relative effect represents a joint effect, and thus requires careful interpretation.
DAGs are useful for considering causal effects for compositional data. In all analyses involving compositional data, researchers should explicitly consider and declare which causal effect is sought and how it should be interpreted.
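The distinction between the total and relative effects can be illustrated with a small simulation. The sketch below is not taken from the paper: the three independent components, the coefficient values in `beta`, the function name `estimate_effects`, and the use of ordinary least squares are all assumptions made for illustration. It assumes the total effect of a component is targeted by a model that omits the whole, and the relative effect by a model that additionally conditions on the whole.

```python
import numpy as np


def estimate_effects(n=200_000, seed=0):
    """Simulate a three-part composition and return (total, relative)
    regression coefficients for the first component.

    All values here are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)

    # Three independent components; their sum is the whole.
    x = rng.normal(loc=5.0, scale=1.0, size=(n, 3))
    whole = x.sum(axis=1)

    # Assumed causal coefficients of each component on the outcome.
    beta = np.array([2.0, 1.0, 0.5])
    y = x @ beta + rng.normal(size=n)

    ones = np.ones(n)

    # Total effect: regress the outcome on component 1 without the whole,
    # so increasing component 1 is allowed to increase the whole.
    total_design = np.column_stack([ones, x[:, 0]])
    total = np.linalg.lstsq(total_design, y, rcond=None)[0][1]

    # Relative effect: also condition on the whole, so increasing
    # component 1 implies an equal decrease spread over the other parts.
    relative_design = np.column_stack([ones, x[:, 0], whole])
    relative = np.linalg.lstsq(relative_design, y, rcond=None)[0][1]

    return total, relative


total, relative = estimate_effects()
print(f"total effect estimate:    {total:.3f}")
print(f"relative effect estimate: {relative:.3f}")
```

Under these assumptions the first estimate should come out near 2.0, the assumed coefficient of the first component, while the second should come out near 1.25, that is 2.0 minus the average of the other two coefficients. The second number is a joint effect: it mixes the gain from the first component with the loss from whatever it displaces, which is why the relative effect needs careful interpretation. When the whole is fixed for every unit, as with time in a day, the whole has no variation and only this substitution-type contrast can be estimated.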