Department of Psychology, Faculty of Arts and Social Sciences, National University of Singapore, Block AS4, Level 2, 9 Arts Link, Singapore, 117570, Singapore.
Behav Res Methods. 2022 Jun;54(3):1063-1077. doi: 10.3758/s13428-021-01582-w. Epub 2021 Sep 20.
Missing data are common in confirmatory factor analysis (CFA). Much work has evaluated the performance of different missing-data techniques when all observed variables are either continuous or ordinal. However, few studies have investigated these techniques when the observed variables are a mix of continuous and ordinal variables. This study investigated the performance of four approaches to handling missing data in such models: a joint ordinal-continuous full information maximum likelihood (FIML) approach and three multiple imputation approaches (fully conditional specification, fully conditional specification with latent variable formulation, and expectation-maximization with bootstrapping), each combined with the weighted least squares with mean and variance adjustment (WLSMV) estimator. In a Monte Carlo simulation, the FIML approach produced unbiased estimates of factor loadings and standard errors in almost all conditions. Fully conditional specification combined with WLSMV was second best, producing accurate estimates when the sample size was large. However, FIML encountered slight non-convergence issues when certain ordinal categories had extremely low frequencies, which is typical of skewed data. If the sample is large, fully conditional specification combined with weighted least squares is recommended when the FIML approach is not feasible (e.g., because of non-convergence, impractical computation times, or missingness predictors that are not of interest to the analysis).
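To make the data structure concrete, the following is a minimal sketch of the kind of dataset such a simulation works with: a one-factor model whose indicators are a mix of continuous and ordinal variables, with some values deleted. Every specific here is an illustrative assumption rather than a detail of the study's design: the function names `simulate_mixed_cfa` and `category_proportions`, the six indicators, the loading of 0.7, the threshold values, the 20% missing rate, and the missing-completely-at-random mechanism. The sketch only generates data and shows how skewed thresholds leave the top ordinal category nearly empty; it does not implement FIML, multiple imputation, or WLSMV.

```python
import numpy as np


def simulate_mixed_cfa(n=500, loading=0.7, thresholds=(0.5, 1.2, 1.8),
                       miss_rate=0.2, seed=0):
    """Simulate six indicators of one latent factor (hypothetical design).

    Columns 0-2 stay continuous; columns 3-5 are cut into ordered
    categories 0..len(thresholds). Missing values are set to NaN
    completely at random (an assumed mechanism).
    """
    rng = np.random.default_rng(seed)
    eta = rng.standard_normal(n)                    # latent factor scores
    resid_sd = np.sqrt(1.0 - loading ** 2)          # keeps unit variance
    y_star = loading * eta[:, None] + resid_sd * rng.standard_normal((n, 6))

    data = y_star.copy()
    # Positive thresholds put most cases in the lowest category,
    # so the resulting ordinal indicators are skewed.
    data[:, 3:] = np.digitize(y_star[:, 3:], thresholds)

    # Delete each cell independently with probability miss_rate.
    missing = rng.random((n, 6)) < miss_rate
    data[missing] = np.nan
    return data


def category_proportions(column, n_categories):
    """Proportion of observed (non-missing) cases in each ordinal category."""
    observed = column[~np.isnan(column)].astype(int)
    return np.bincount(observed, minlength=n_categories) / observed.size


data = simulate_mixed_cfa()
proportions = category_proportions(data[:, 3], n_categories=4)
print("Overall missing rate:", np.isnan(data).mean())
print("Category proportions for the first ordinal indicator:", proportions)
```

With thresholds like these, only a small share of observed cases falls in the highest category, which is the sparse-category situation the abstract associates with FIML non-convergence.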