Empirical Comparison of Imputation Methods for Multivariate Missing Data in Public Health.

Affiliation

Department of Biostatistics and Epidemiology, University of Oklahoma Health Sciences Center, 801 NE 13th St, Oklahoma City, OK 73104, USA.

Publication information

Int J Environ Res Public Health. 2023 Jan 14;20(2):1524. doi: 10.3390/ijerph20021524.

Abstract

Sample estimates derived from data with missing values may be unreliable, and nonresponse bias can distort the inferences that researchers make about the underlying population. As a result, imputation is often preferred to listwise deletion for handling multivariate missing data. In this study, we compared three popular imputation methods for handling multivariate missingness under different missing patterns: sequential multiple imputation, fractional hot-deck imputation, and generalized efficient regression-based imputation with latent processes. We conducted descriptive and regression analyses on the imputed data and examined how the estimates differed from those generated from the full sample. Limited Monte Carlo simulation results based on the National Health and Nutrition Examination Survey and the Behavioral Risk Factor Surveillance System are presented to demonstrate the effect of each imputation method on reducing bias and increasing efficiency for the parameter estimate of interest for a particular incomplete variable. Although these three methods did not always outperform listwise deletion in our simulated missing patterns, they improved many descriptive and regression estimates when used to impute all incomplete variables at once.
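The comparison described in the abstract (estimates after handling missingness versus estimates from the full sample, averaged over Monte Carlo replicates) can be illustrated with a small sketch. This is not the authors' code and uses none of the three methods studied; it assumes a synthetic two-variable data set, a missing-at-random mechanism driven by a logistic function of `x`, and a single stochastic regression imputation as a simple stand-in. All function and variable names (`simulate_once`, `run_simulation`, and so on) are hypothetical.

```python
import math
import random
import statistics


def simulate_once(rng, n=300):
    """One replicate: return (listwise bias, imputation bias) for the mean of y."""
    # Synthetic full sample (an assumption, not survey data): y depends linearly on x.
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [2.0 + 1.5 * xi + rng.gauss(0, 1) for xi in x]
    full_mean = statistics.fmean(y)

    # Missing at random: the chance that y is missing depends only on the observed x.
    missing = [rng.random() < 1 / (1 + math.exp(-xi)) for xi in x]

    # Listwise deletion: keep only the complete cases.
    xo = [xi for xi, m in zip(x, missing) if not m]
    yo = [yi for yi, m in zip(y, missing) if not m]
    listwise_mean = statistics.fmean(yo)

    # Fit y ~ x by ordinary least squares on the complete cases.
    mx, my = statistics.fmean(xo), statistics.fmean(yo)
    sxx = sum((xi - mx) ** 2 for xi in xo)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(xo, yo)) / sxx
    intercept = my - slope * mx
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(xo, yo))
    resid_sd = math.sqrt(sse / (len(xo) - 2))

    # Stochastic regression imputation: prediction plus a random residual draw.
    y_imputed = [
        intercept + slope * xi + rng.gauss(0, resid_sd) if m else yi
        for xi, yi, m in zip(x, y, missing)
    ]
    imputed_mean = statistics.fmean(y_imputed)

    return listwise_mean - full_mean, imputed_mean - full_mean


def run_simulation(reps=200, seed=1):
    """Average the bias of each approach, relative to the full sample, over replicates."""
    rng = random.Random(seed)
    results = [simulate_once(rng) for _ in range(reps)]
    bias_listwise = statistics.fmean(r[0] for r in results)
    bias_imputed = statistics.fmean(r[1] for r in results)
    return bias_listwise, bias_imputed


if __name__ == "__main__":
    bias_listwise, bias_imputed = run_simulation()
    print(f"listwise deletion bias: {bias_listwise:.3f}")
    print(f"regression imputation bias: {bias_imputed:.3f}")
```

In this setup, cases with large `x` are more likely to be missing, so the complete-case mean of `y` tends to be too low, while the imputed mean stays close to the full-sample value. A multiple-imputation analysis would repeat the imputation step several times and pool the results, which this single-imputation sketch does not do.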
