Lehmann Thomas, Schlattmann Peter
Institute of Medical Statistics, Computer Sciences and Documentation, University Hospital Jena, Friedrich-Schiller-University, Bachstr. 18, 07743, Jena, Germany.
Biom J. 2017 Jan;59(1):159-171. doi: 10.1002/bimj.201500037. Epub 2016 Nov 2.
Multiple imputation has become a widely accepted technique to deal with the problem of incomplete data. Typically, imputation of missing values and the statistical analysis are performed separately. Therefore, the imputation model has to be consistent with the analysis model. If the data are analyzed with a mixture model, the parameter estimates are usually obtained iteratively. Thus, if the data are missing not at random, parameter estimation and treatment of missingness should be combined. We solve both problems by simultaneously imputing values using the data augmentation method and estimating parameters using the EM algorithm. This iterative procedure ensures that the missing values are properly imputed given the current parameter estimates. Properties of the parameter estimates were investigated in a simulation study. The results are illustrated using data from the National Health and Nutrition Examination Survey.
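The alternation described above, imputing missing values given the current parameter estimates and then updating those estimates with EM, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a bivariate normal mixture in which `x` is fully observed and `y` is partly missing, and it assumes the missingness is ignorable given `x`; data that are missing not at random would need an explicit model of the missingness mechanism, which is omitted here. The function name `em_with_imputation` and all of its parameters are hypothetical.

```python
import numpy as np

def log_gauss(data, mean, cov):
    """Log density of a bivariate normal evaluated at each row of data."""
    d = data - mean
    quad = np.sum(d * np.linalg.solve(cov, d.T).T, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (2.0 * np.log(2.0 * np.pi) + logdet + quad)

def em_with_imputation(x, y, n_components=2, n_iter=50, seed=0):
    """Alternate an imputation step with one EM update (illustrative only).

    Assumption: x is complete, y contains NaN for missing entries, and
    missingness is ignorable given x.
    """
    rng = np.random.default_rng(seed)
    k, n = n_components, len(x)
    miss = np.isnan(y)
    data = np.column_stack([x, y]).astype(float)
    data[miss, 1] = np.nanmean(y)  # crude starting fill

    # Initialise component means from chunks of the data sorted by x.
    chunks = np.array_split(np.argsort(x), k)
    means = np.array([data[idx].mean(axis=0) for idx in chunks])
    base_cov = np.cov(data.T) + 1e-6 * np.eye(2)
    covs = np.array([base_cov.copy() for _ in range(k)])
    weights = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # Imputation step: draw a component given x, then y given x and component.
        if miss.any():
            xm = data[miss, 0]
            var_x = covs[:, 0, 0]
            log_p = (np.log(weights) - 0.5 * np.log(2.0 * np.pi * var_x)
                     - 0.5 * (xm[:, None] - means[:, 0]) ** 2 / var_x)
            log_p -= log_p.max(axis=1, keepdims=True)
            p = np.exp(log_p)
            p /= p.sum(axis=1, keepdims=True)
            u = rng.random(len(xm))
            comp = np.minimum((p.cumsum(axis=1) < u[:, None]).sum(axis=1), k - 1)
            slope = covs[comp, 0, 1] / covs[comp, 0, 0]
            cond_mean = means[comp, 1] + slope * (xm - means[comp, 0])
            cond_var = np.maximum(covs[comp, 1, 1] - slope * covs[comp, 0, 1], 1e-12)
            data[miss, 1] = rng.normal(cond_mean, np.sqrt(cond_var))

        # E-step: posterior component probabilities for the completed data.
        log_r = np.column_stack([
            np.log(weights[j]) + log_gauss(data, means[j], covs[j])
            for j in range(k)
        ])
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: weighted updates of weights, means and covariances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (resp.T @ data) / nk[:, None]
        for j in range(k):
            d = data - means[j]
            covs[j] = (resp[:, j, None] * d).T @ d / nk[j] + 1e-6 * np.eye(2)

    return weights, means, covs, data
```

Because the imputed values are redrawn in every iteration, the estimates fluctuate around a solution rather than converging exactly. In a multiple-imputation setting one would typically keep several completed data sets from late iterations and pool the resulting estimates.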