Multiple imputation for discrete data: Evaluation of the joint latent normal model.

Author information

Quartagno Matteo, Carpenter James R

Affiliations

Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK.

MRC Clinical Trials Unit at UCL, 90 High Holborn, London, UK.

Publication information

Biom J. 2019 Jul;61(4):1003-1019. doi: 10.1002/bimj.201800222. Epub 2019 Mar 14.

DOI:10.1002/bimj.201800222

PMID:30868652

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6618333/

Abstract

Missing data are ubiquitous in clinical and social research, and multiple imputation (MI) is increasingly the methodology of choice for practitioners. Two principal strategies for imputation have been proposed in the literature: joint modelling multiple imputation (JM-MI) and full conditional specification multiple imputation (FCS-MI). While JM-MI is arguably a preferable approach, because it involves specification of an explicit imputation model, FCS-MI is pragmatically appealing, because of its flexibility in handling different types of variables. JM-MI has developed from the multivariate normal model, and latent normal variables have been proposed as a natural way to extend this model to handle categorical variables. In this article, we evaluate the latent normal model through an extensive simulation study and an application on data from the German Breast Cancer Study Group, comparing the results with FCS-MI. We divide our investigation in four sections, focusing on (i) binary, (ii) categorical, (iii) ordinal, and (iv) count data. Using data simulated from both the latent normal model and the general location model, we find that in all but one extreme general location model setting JM-MI works very well, and sometimes outperforms FCS-MI. We conclude the latent normal model, implemented in the R package jomo, can be used with confidence by researchers, both for single and multilevel multiple imputation.
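
To illustrate the latent normal idea mentioned in the abstract, the following is a minimal sketch and not the jomo implementation or code from the paper. It treats a binary variable as the indicator that an unobserved normal variable exceeds zero, so a value can be imputed by drawing the latent normal and thresholding it. The function names, the unit latent variance, and the fixed latent mean are illustrative assumptions; a real JM-MI sampler would also draw the model parameters and the latent values conditional on the observed data.

```python
import random
from statistics import NormalDist


def latent_mean_for_probability(p: float) -> float:
    """Return mu such that P(Z > 0) = p when Z ~ N(mu, 1).

    Since P(Z > 0) = Phi(mu), mu is the standard normal quantile of p.
    """
    return NormalDist().inv_cdf(p)


def impute_binary(mu: float, rng: random.Random) -> int:
    """Draw a latent Z ~ N(mu, 1) and threshold it at zero."""
    z = rng.gauss(mu, 1.0)
    return 1 if z > 0 else 0


# Hypothetical setting: the binary variable equals 1 with probability 0.3.
rng = random.Random(2019)
mu = latent_mean_for_probability(0.3)
draws = [impute_binary(mu, rng) for _ in range(20000)]
share_of_ones = sum(draws) / len(draws)
print(f"latent mean: {mu:.3f}, share of ones: {share_of_ones:.3f}")
```

Thresholding a latent normal in this way is what lets a binary variable sit inside a joint multivariate normal model alongside continuous variables, with the dependence between them carried by the latent covariance.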

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f2ed/6618333/bdc52d12486a/BIMJ-61-1003-g001.jpg

Similar articles

Multiple imputation methods for missing multilevel ordinal outcomes.

BMC Med Res Methodol. 2023 May 9;23(1):112. doi: 10.1186/s12874-023-01909-5.

Handling missing data in matched case-control studies using multiple imputation.

Biometrics. 2015 Dec;71(4):1150-9. doi: 10.1111/biom.12358. Epub 2015 Aug 3.

Review and evaluation of imputation methods for multivariate longitudinal data with mixed-type incomplete variables.

Stat Med. 2022 Dec 30;41(30):5844-5876. doi: 10.1002/sim.9592. Epub 2022 Oct 11.

Multiple imputation in the presence of an incomplete binary variable created from an underlying continuous variable.

Biom J. 2020 Mar;62(2):467-478. doi: 10.1002/bimj.201900011. Epub 2019 Jul 15.

A comparison of multiple imputation methods for missing data in longitudinal studies.

BMC Med Res Methodol. 2018 Dec 12;18(1):168. doi: 10.1186/s12874-018-0615-6.

Rounding strategies for multiply imputed binary data.

Biom J. 2009 Aug;51(4):677-88. doi: 10.1002/bimj.200900018.

A comparison of multiple imputation methods for handling missing values in longitudinal data in the presence of a time-varying covariate with a non-linear association with time: a simulation study.

BMC Med Res Methodol. 2017 Jul 25;17(1):114. doi: 10.1186/s12874-017-0372-y.

Multiple Imputation in Multilevel Models. A Revision of the Current Software and Usage Examples for Researchers.

Span J Psychol. 2020 Nov 12;23:e46. doi: 10.1017/SJP.2020.48.

Multiple imputation of discrete and continuous data by fully conditional specification.

Stat Methods Med Res. 2007 Jun;16(3):219-42. doi: 10.1177/0962280206074463.

Cited by

Conceptual framework as a guide to choose appropriate imputation method for missing values in a clinical structured dataset.

BMC Med Res Methodol. 2025 Feb 20;25(1):43. doi: 10.1186/s12874-025-02496-3.

Reference-Based Multiple Imputation for Longitudinal Binary Data.

Stat Med. 2025 Feb 10;44(3-4):e10301. doi: 10.1002/sim.10301.

Multiple Imputation for Longitudinal Data: A Tutorial.

Stat Med. 2025 Feb 10;44(3-4):e10274. doi: 10.1002/sim.10274.

Identify the most appropriate imputation method for handling missing values in clinical structured datasets: a systematic review.

BMC Med Res Methodol. 2024 Aug 28;24(1):188. doi: 10.1186/s12874-024-02310-6.

A novel machine learning-based imputation strategy for missing data in step-stress accelerated degradation test.

Heliyon. 2024 Feb 18;10(4):e26429. doi: 10.1016/j.heliyon.2024.e26429. eCollection 2024 Feb 29.

The stroke transitional care intervention for older adults with stroke and multimorbidity: a multisite pragmatic randomized controlled trial.

BMC Geriatr. 2023 Oct 24;23(1):687. doi: 10.1186/s12877-023-04403-1.

Two-stage or not two-stage? That is the question for IPD meta-analysis projects.

Res Synth Methods. 2023 Nov;14(6):903-910. doi: 10.1002/jrsm.1661. Epub 2023 Aug 22.

Multiple imputation methods for missing multilevel ordinal outcomes.

BMC Med Res Methodol. 2023 May 9;23(1):112. doi: 10.1186/s12874-023-01909-5.

Real-time imputation of missing predictor values in clinical practice.

Eur Heart J Digit Health. 2020 Dec 19;2(1):154-164. doi: 10.1093/ehjdh/ztaa016. eCollection 2021 Mar.

Assessing Alternative Imputation Strategies for Infrequently Missing Items on Multi-item Scales.

Commun Stat Case Stud Data Anal Appl. 2022;8(4):682-713. doi: 10.1080/23737484.2022.2115430. Epub 2022 Sep 1.
