ML versus MI for Missing Data with Violation of Distribution Conditions.

Authors

Yuan Ke-Hai, Yang-Wallentin Fan, Bentler Peter M

Affiliations

University of Notre Dame.

Uppsala University, Sweden.

Publication information

Sociol Methods Res. 2012 Nov;41(4):598-629. doi: 10.1177/0049124112460373.

Abstract

Normal-distribution-based maximum likelihood (ML) and multiple imputation (MI) are the two major procedures for missing data analysis. This article compares the two procedures with respect to bias and efficiency of parameter estimates. It also compares formula-based standard errors (SEs) for each procedure against the corresponding empirical SEs. The results indicate that parameter estimates by MI tend to be less efficient than those by ML, and that the estimates of variance-covariance parameters by MI are also more biased. In particular, when the population for the observed variables possesses heavy tails, estimates of variance-covariance parameters by MI may contain severe bias even at relatively large sample sizes. Although ML performs much better, its parameter estimates may also contain substantial bias at smaller sample sizes. The results also indicate that, when the underlying population is close to normally distributed, SEs based on the sandwich-type covariance matrix and those based on the observed information matrix are both very close to the empirical SEs with either ML or MI. When the underlying distribution has heavier tails, SEs based on the sandwich-type covariance matrix for ML estimates are more reliable than those based on the observed information matrix. Both empirical results and analysis show that neither SEs based on the observed information matrix nor those based on the sandwich-type covariance matrix can provide consistent SEs in MI. Thus, ML is preferable to MI in practice, although parameter estimates by MI might still be consistent.
