Department of Statistics and Operations Research, Tel Aviv University, Tel Aviv, Israel.
Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
Int J Epidemiol. 2021 Jul 9;50(3):1030-1037. doi: 10.1093/ije/dyaa278.
Molecular pathological epidemiology research provides information about pathogenic mechanisms. A common study goal is to evaluate whether the effects of risk factors on disease incidence vary between disease subtypes. A popular approach to this type of research is to fit a multinomial regression in which each non-zero value of the outcome corresponds to a bona fide disease subtype. Heterogeneity in the exposure effects is then examined by comparing the exposure coefficients across subtypes. In this paper, we explain why this common method may fail to recover causal effects, even when all confounders are measured, because of a particular type of selection bias. This bias can be explained by recognizing that the multinomial regression is equivalent to a series of logistic regressions, each of which compares cases of a certain subtype with the controls. We further explain how this bias arises using directed acyclic graphs, and we demonstrate its potential magnitude by analysis of a hypothetical data set and by a simulation study.
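The mechanism described above can be illustrated with a small simulation. The following is a minimal sketch and not the authors' simulation study: it assumes a hypothetical unmeasured factor `u` that raises the risk of both subtypes, an exposure `x` that affects subtype 1 only, and deliberately exaggerated, made-up probabilities. Because the exposure is a single binary variable, the odds ratio from a 2x2 table stands in for the coefficient that a logistic regression of subtype-2 cases versus controls would estimate.

```python
import numpy as np

def odds_ratio(outcome, exposure):
    """Odds ratio of a binary outcome, comparing exposure == 1 with exposure == 0."""
    a = np.sum((outcome == 1) & (exposure == 1))
    b = np.sum((outcome == 0) & (exposure == 1))
    c = np.sum((outcome == 1) & (exposure == 0))
    d = np.sum((outcome == 0) & (exposure == 0))
    return (a * d) / (b * c)

def simulate(n=200_000, seed=0):
    """Hypothetical data-generating mechanism (all numbers are illustrative)."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n)  # exposure
    u = rng.integers(0, 2, size=n)  # unmeasured factor, independent of x

    # Subtype 1 risk depends on both x and u (rows index x, columns index u).
    p1 = np.array([[0.05, 0.30],
                   [0.20, 0.80]])[x, u]
    # Subtype 2 risk depends on u only: x has no causal effect on subtype 2.
    p2 = np.where(u == 1, 0.40, 0.05)

    y1 = rng.random(n) < p1
    y2_latent = rng.random(n) < p2

    # Observed outcome: 0 = control, 1 = subtype 1, 2 = subtype 2.
    # Subtype 1 takes precedence, so the subtypes are mutually exclusive.
    outcome = np.where(y1, 1, np.where(y2_latent, 2, 0))

    # Subtype-2 cases versus controls: subtype-1 cases are excluded,
    # which is the selection step.
    keep = outcome != 1
    or_naive = odds_ratio((outcome[keep] == 2).astype(int), x[keep])

    # Reference: association of x with the latent subtype-2 indicator
    # in the full population, where no selection has occurred.
    or_latent = odds_ratio(y2_latent.astype(int), x)
    return or_latent, or_naive

if __name__ == "__main__":
    or_latent, or_naive = simulate()
    print(f"full population, latent subtype 2: OR = {or_latent:.2f}")
    print(f"subtype-2 cases vs controls:       OR = {or_naive:.2f}")
```

Under these assumed probabilities, the odds ratio in the full population should be close to 1, whereas the odds ratio restricted to subtype-2 cases and controls works out analytically to roughly 0.55, even though `x` has no effect on subtype 2. Excluding subtype-1 cases makes `x` and `u` dependent among the remaining subjects, and `u` then carries a spurious association into the comparison.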