Inoue Kosuke, Sakamaki Kentaro, Komukai Sho, Ito Yuri, Goto Atsushi, Shinozaki Tomohiro
Department of Social Epidemiology, Graduate School of Medicine, Kyoto University.
Hakubi Center for Advanced Research, Kyoto University.
J Epidemiol. 2025 Jan 5;35(1):3-10. doi: 10.2188/jea.JE20240082. Epub 2024 Nov 30.
In observational studies, identifying and adjusting for a sufficient set of confounders is crucial for accurately estimating the causal effect of an exposure on an outcome. Even studies with large sample sizes, which typically benefit from small variances in their estimates, risk producing estimates that are precise but inaccurate if they suffer from systematic errors or biases, including confounding bias. Several approaches to confounder selection have been developed to date. In this article, we first summarize the epidemiological and statistical approaches to identifying a sufficient set of confounders. In particular, we introduce the modified disjunctive cause criterion as one of the most useful approaches: it controls for any pre-exposure covariate that affects the exposure, the outcome, or both, and then excludes instrumental variables but includes proxies for shared common causes of the exposure and outcome. Statistical confounder selection is also useful when dealing with a large number of covariates, even in studies with small sample sizes. After introducing these approaches, we discuss some pitfalls and considerations in confounder selection, such as adjustment for instrumental variables, intermediate variables, and baseline outcome variables. Lastly, because it is often difficult to measure all key confounders, we introduce two statistics, the E-value and the robustness value, for assessing sensitivity to unmeasured confounding. Illustrative examples are provided using the National Health and Nutrition Examination Survey Epidemiologic Follow-up Study. Integrating these principles and approaches will enhance our understanding of confounder selection and facilitate better reporting and interpretation of future epidemiological studies.
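As a rough illustration of two ideas named in the abstract, the Python sketch below applies the modified disjunctive cause criterion to a list of covariates and computes an E-value for a risk ratio. It is a minimal sketch under stated assumptions, not the authors' implementation: the `Covariate` class, its flags, the `select_confounders` function, and the covariate names are all hypothetical, and the causal flags stand in for subject-matter knowledge that an analyst would have to supply. The E-value uses the standard closed form for a risk ratio, RR + sqrt(RR × (RR − 1)), with RR inverted first when it is below 1. The robustness value is not shown.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Covariate:
    """Hypothetical record of analyst-supplied causal knowledge about one
    pre-exposure covariate. None of these fields come from the article."""
    name: str
    causes_exposure: bool = False
    causes_outcome: bool = False
    known_instrument: bool = False        # affects the outcome only through the exposure
    proxy_for_common_cause: bool = False  # stands in for an unmeasured shared cause


def select_confounders(covariates):
    """Sketch of the modified disjunctive cause criterion:
    1. keep any covariate that causes the exposure, the outcome, or both;
    2. drop covariates known to be instrumental variables;
    3. add proxies for shared common causes of exposure and outcome.
    """
    selected = []
    for cov in covariates:
        disjunctive = cov.causes_exposure or cov.causes_outcome
        if (disjunctive and not cov.known_instrument) or cov.proxy_for_common_cause:
            selected.append(cov.name)
    return selected


def e_value(risk_ratio):
    """E-value for a point estimate on the risk-ratio scale."""
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    # Protective estimates are inverted so the formula applies to RR >= 1.
    rr = risk_ratio if risk_ratio >= 1 else 1.0 / risk_ratio
    return rr + math.sqrt(rr * (rr - 1.0))


if __name__ == "__main__":
    # Hypothetical covariates, for illustration only.
    candidates = [
        Covariate("age", causes_exposure=True, causes_outcome=True),
        Covariate("baseline_blood_pressure", causes_outcome=True),
        Covariate("distance_to_clinic", causes_exposure=True, known_instrument=True),
        Covariate("education_level", proxy_for_common_cause=True),
        Covariate("shoe_size"),
    ]
    print(select_confounders(candidates))
    print(round(e_value(2.0), 2))
```

In this sketch, a covariate flagged as a known instrument is dropped even though it causes the exposure, while a proxy is kept even if it causes neither the exposure nor the outcome directly. An E-value close to 1 would suggest that weak unmeasured confounding could explain away the estimate; larger values indicate that stronger confounding would be required.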