A comparison of imputation techniques for handling missing data.

Authors

Musil Carol M, Warner Camille B, Yobas Piyanee Klainin, Jones Susan L

Affiliation

Frances Payne Bolton School of Nursing, Case Western Reserve University, Cleveland, Ohio, USA.

Publication information

West J Nurs Res. 2002 Nov;24(7):815-29. doi: 10.1177/019394502762477004.

DOI:10.1177/019394502762477004
PMID:12428897
Abstract

Researchers are commonly faced with the problem of missing data. This article presents theoretical and empirical information for the selection and application of approaches for handling missing data on a single variable. An actual data set of 492 cases with no missing values was used to create a simulated yet realistic data set with missing at random (MAR) data. The authors compare and contrast five approaches (listwise deletion, mean substitution, simple regression, regression with an error term, and the expectation maximization [EM] algorithm) for dealing with missing data, and compare the effects of each method on descriptive statistics and correlation coefficients for the imputed data (n = 96) and the entire sample (n = 492) when imputed data are included. All methods had limitations, although our findings suggest that mean substitution was the least effective and that regression with an error term and the EM algorithm produced estimates closest to those of the original variables.
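
To make the comparison concrete, here is a minimal Python sketch of four of the five approaches named in the abstract (listwise deletion, mean substitution, simple regression, and regression with an error term). It is an illustration only, not the authors' procedure or data: the simulated variables, the 0.6 slope, the logistic missingness weights, and the function names `simulate_mar`, `impute`, and `summarize` are all assumptions made for this example. The EM algorithm is left out to keep the sketch short.

```python
import numpy as np

def simulate_mar(n=492, n_missing=96, seed=0):
    """Simulate (x, y) and flag n_missing values of y as missing at random:
    the chance of being missing depends only on the fully observed x."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 0.6 * x + rng.normal(scale=0.8, size=n)   # assumed relationship
    weights = 1.0 / (1.0 + np.exp(-x))            # higher x -> more likely missing
    idx = rng.choice(n, size=n_missing, replace=False, p=weights / weights.sum())
    missing = np.zeros(n, dtype=bool)
    missing[idx] = True
    return x, y, missing

def impute(x, y, missing, method, rng):
    """Return a copy of y with the flagged values replaced by imputed ones."""
    obs = ~missing
    y_imp = y.copy()
    if method == "mean":
        # Mean substitution: every missing value gets the observed mean.
        y_imp[missing] = y[obs].mean()
        return y_imp
    if method not in ("regression", "regression_error"):
        raise ValueError(f"unknown method: {method}")
    # Simple regression of y on x, fitted on the complete cases only.
    slope, intercept = np.polyfit(x[obs], y[obs], 1)
    pred = intercept + slope * x[missing]
    if method == "regression_error":
        # Add random noise with the residual standard deviation.
        resid = y[obs] - (intercept + slope * x[obs])
        pred = pred + rng.normal(scale=resid.std(ddof=2), size=pred.shape)
    y_imp[missing] = pred
    return y_imp

def summarize(x, y):
    """Mean and SD of y, and its Pearson correlation with x."""
    return y.mean(), y.std(ddof=1), np.corrcoef(x, y)[0, 1]

x, y_true, missing = simulate_mar()
rng = np.random.default_rng(1)
results = {
    "original": summarize(x, y_true),
    # Listwise deletion: analyze only the cases with y observed.
    "listwise deletion": summarize(x[~missing], y_true[~missing]),
}
for method in ("mean", "regression", "regression_error"):
    results[method] = summarize(x, impute(x, y_true, missing, method, rng))

for name, (m, sd, r) in results.items():
    print(f"{name:18s} mean={m:.3f} sd={sd:.3f} r={r:.3f}")
```

Comparing each row with the "original" row mirrors the kind of comparison the abstract describes. Mean substitution necessarily shrinks the standard deviation, and plain regression places every imputed value exactly on the fitted line, which is the motivation for adding an error term.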

Similar articles

1
A comparison of imputation techniques for handling missing data.
West J Nurs Res. 2002 Nov;24(7):815-29. doi: 10.1177/019394502762477004.
2
Dealing with missing data in a multi-question depression scale: a comparison of imputation methods.
BMC Med Res Methodol. 2006 Dec 13;6:57. doi: 10.1186/1471-2288-6-57.
3
Missing data: an introductory conceptual overview for the novice researcher.
Can J Nurs Res. 2005 Dec;37(4):156-71.
4
Multiple imputation: dealing with missing data.
Nephrol Dial Transplant. 2013 Oct;28(10):2415-20. doi: 10.1093/ndt/gft221. Epub 2013 May 31.
5
Methods for Handling Missing Data in the Behavioral Neurosciences: Don't Throw the Baby Rat out with the Bath Water.
J Undergrad Neurosci Educ. 2007 Spring;5(2):A71-7. Epub 2007 Jun 15.
6
A comparison of multiple imputation methods for handling missing values in longitudinal data in the presence of a time-varying covariate with a non-linear association with time: a simulation study.
BMC Med Res Methodol. 2017 Jul 25;17(1):114. doi: 10.1186/s12874-017-0372-y.
7
Missing data imputation via the expectation-maximization algorithm can improve principal component analysis aimed at deriving biomarker profiles and dietary patterns.
Nutr Res. 2020 Mar;75:67-76. doi: 10.1016/j.nutres.2020.01.001. Epub 2020 Jan 9.
8
An empirical comparison of some missing data treatments in PLS-SEM.
PLoS One. 2024 Jan 19;19(1):e0297037. doi: 10.1371/journal.pone.0297037. eCollection 2024.
9
[Imputation methods for missing data in educational diagnostic evaluation].
Psicothema. 2012 Feb;24(1):167-75.
10
Missing data in bioarchaeology II: A test of ordinal and continuous data imputation.
Am J Biol Anthropol. 2022 Nov;179(3):349-364. doi: 10.1002/ajpa.24614. Epub 2022 Sep 12.

Cited by

1
Translation, reliability, and validation of the Dutch Safe Use of Mobility Aid Checklist (SUMAC-NL) for walker use in people living with dementia.
F1000Res. 2025 Jun 30;12:1150. doi: 10.12688/f1000research.132762.3. eCollection 2023.
2
Flexible imputation toolkit for electronic health records.
Sci Rep. 2025 May 17;15(1):17176. doi: 10.1038/s41598-025-02276-5.
3
Parental Feeding Practices, Weight Perception, and Children's Appetitive Traits Are Associated with Weight Trajectories in Preschoolers: A Longitudinal Study in China.
Nutrients. 2024 Oct 31;16(21):3746. doi: 10.3390/nu16213746.
4
The impact of misclassifications and outliers on imputation methods.
J Appl Stat. 2024 Mar 5;51(14):2894-2928. doi: 10.1080/02664763.2024.2325969. eCollection 2024.
5
Limited generalizability of multivariate brain-based dimensions of child psychiatric symptoms.
Commun Psychol. 2024 Feb 28;2(1):16. doi: 10.1038/s44271-024-00063-y.
6
Use of Sequential Hot-Deck Imputation for Missing Health Care Systems Data for Population Health Research.
Med Care. 2024 May 1;62(5):319-325. doi: 10.1097/MLR.0000000000001995. Epub 2024 Mar 28.
7
Examining specific and non-specific symptoms of the best-fitting posttraumatic stress disorder model in conflict-exposed adolescents.
BMC Psychol. 2023 Oct 24;11(1):353. doi: 10.1186/s40359-023-01389-8.
8
Multiple imputation using chained equations for missing data in survival models: applied to multidrug-resistant tuberculosis and HIV data.
J Public Health Afr. 2023 Jun 5;14(8):2388. doi: 10.4081/jphia.2023.2388. eCollection 2023 Aug 7.
9
Medical prediction from missing data with max-minus negative regularized dropout.
Front Neurosci. 2023 Jul 13;17:1221970. doi: 10.3389/fnins.2023.1221970. eCollection 2023.
10
Measuring adherence to inhaled control medication in patients with asthma: Comparison among an asthma app, patient self-report and physician assessment.
Clin Transl Allergy. 2023 Feb;13(2):e12210. doi: 10.1002/clt2.12210.