Appropriate inclusion of interactions was needed to avoid bias in multiple imputation.

Authors

Tilling Kate, Williamson Elizabeth J, Spratt Michael, Sterne Jonathan A C, Carpenter James R

Affiliations

School of Social and Community Medicine, University of Bristol, Canynge Hall, 39 Whatley Road, Bristol, BS8 2PS, UK.

Department of Medical Statistics, London School of Hygiene and Tropical Medicine, University of London, Keppel Street, London WC1E 7HT, UK; Farr Institute of Health Informatics, University College London, 222 Euston Road, London NW1 2DA, UK.

Publication information

J Clin Epidemiol. 2016 Dec;80:107-115. doi: 10.1016/j.jclinepi.2016.07.004. Epub 2016 Jul 19.

DOI: 10.1016/j.jclinepi.2016.07.004

PMID: 27445178

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5176003/

Abstract

OBJECTIVE

Missing data are a pervasive problem, often leading to bias in complete records analysis (CRA). Multiple imputation (MI) via chained equations is one solution, but its use in the presence of interactions is not straightforward.

STUDY DESIGN AND SETTING

We simulated data with outcome Y dependent on binary explanatory variables X and Z and their interaction XZ. Six scenarios were simulated (Y continuous and binary, each with no interaction, a weak interaction, and a strong interaction), under five missing data mechanisms. We used directed acyclic graphs to identify when CRA and MI would each be unbiased. We evaluated the performance of CRA, MI without interactions, MI including all interactions, and stratified imputation. We also illustrated these methods using a simple example from the National Child Development Study (NCDS).
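One way to write down the data-generating models described here is sketched below. The abstract does not state the functional forms, so the linear model for continuous Y, the logistic model for binary Y, and the coefficient symbols are all assumptions made for illustration.

```latex
% Assumed form for continuous Y (linear model with normal errors)
Y = \beta_0 + \beta_X X + \beta_Z Z + \beta_{XZ} XZ + \varepsilon,
\qquad \varepsilon \sim N(0, \sigma^2)

% Assumed form for binary Y (logistic model with the same linear predictor)
\operatorname{logit} \Pr(Y = 1 \mid X, Z)
  = \beta_0 + \beta_X X + \beta_Z Z + \beta_{XZ} XZ
```

Under this notation, the three interaction settings would correspond to \beta_{XZ} = 0, a small nonzero \beta_{XZ}, and a large \beta_{XZ}.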

RESULTS

MI excluding interactions is invalid and resulted in biased estimates and low coverage. When the XZ interaction was zero, MI excluding interactions gave unbiased estimates but overcoverage. MI including interactions and stratified MI gave equivalent, valid inference in all cases. In the NCDS example, MI excluding interactions incorrectly concluded there was no evidence for an important interaction.
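The pattern in these results can be illustrated with a small simulation. The sketch below is not the authors' simulation design: the coefficients, the missingness rates, the choice to make Y the incomplete variable, and every function name are assumptions made for illustration. It is also simplified: it repeats stochastic regression imputation and averages the point estimates, without drawing the imputation-model parameters or applying Rubin's rules, so it speaks only to bias and not to coverage.

```python
import random
from statistics import mean


def simulate(n, beta_xz, rng):
    """Y = 1 + X + Z + beta_xz*X*Z + N(0, 1) noise, with binary X and Z."""
    rows = []
    for _ in range(n):
        x, z = int(rng.random() < 0.5), int(rng.random() < 0.5)
        y = 1.0 + x + z + beta_xz * x * z + rng.gauss(0, 1)
        # Assumed MAR mechanism: Y is missing more often when X = 1.
        missing = rng.random() < (0.6 if x else 0.2)
        rows.append((x, z, None if missing else y))
    return rows


def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    k = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(k):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    return [m[i][k] / m[i][i] for i in range(k)]


def fit_ols(design, y):
    """Least-squares coefficients and residual SD for y regressed on design."""
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    coef = solve(xtx, xty)
    resid = [yi - sum(c * v for c, v in zip(coef, d)) for d, yi in zip(design, y)]
    return coef, (sum(e * e for e in resid) / (len(y) - k)) ** 0.5


def impute(rows, features, rng):
    """Replace missing Y by a fitted value plus normal noise (one imputation)."""
    obs = [r for r in rows if r[2] is not None]
    coef, sd = fit_ols([features(x, z) for x, z, _ in obs], [r[2] for r in obs])
    out = []
    for x, z, y in rows:
        if y is None:
            y = sum(c * v for c, v in zip(coef, features(x, z))) + rng.gauss(0, sd)
        out.append((x, z, y))
    return out


def impute_stratified(rows, rng):
    """Impute Y from X separately within each stratum of Z."""
    out = []
    for z0 in (0, 1):
        out += impute([r for r in rows if r[1] == z0], lambda x, z: [1.0, x], rng)
    return out


def interaction(rows):
    """XZ coefficient of the saturated model: a difference of differences."""
    m = {(a, b): mean(y for x, z, y in rows if (x, z) == (a, b) and y is not None)
         for a in (0, 1) for b in (0, 1)}
    return (m[1, 1] - m[0, 1]) - (m[1, 0] - m[0, 0])


def additive(x, z):
    return [1.0, x, z]  # imputation model without the XZ term


def saturated(x, z):
    return [1.0, x, z, x * z]  # imputation model including the XZ term


def compare(n=20000, beta_xz=1.0, m=10, seed=1):
    rng = random.Random(seed)
    rows = simulate(n, beta_xz, rng)
    return {
        "CRA": interaction(rows),
        "MI, no interaction": mean(interaction(impute(rows, additive, rng)) for _ in range(m)),
        "MI, with interaction": mean(interaction(impute(rows, saturated, rng)) for _ in range(m)),
        "Stratified MI": mean(interaction(impute_stratified(rows, rng)) for _ in range(m)),
    }


results = compare()
for name, estimate in results.items():
    print(f"{name}: {estimate:.2f}")
```

With a true interaction of 1 in this setup, the imputation model that omits XZ should pull the estimated interaction well toward zero, because every imputed Y is generated from a model with no interaction. The imputation model that includes XZ and the version stratified by Z should both stay close to 1, as should CRA under this particular missingness mechanism.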

CONCLUSIONS

Epidemiologists carrying out MI should ensure that their imputation model(s) are compatible with their analysis model.
