Vidotto Davide, Vermunt Jeroen K, Van Deun Katrijn
Department of Methodology and Statistics, Tilburg University, Tilburg, Netherlands.
J Appl Stat. 2019 Nov 24;47(10):1720-1738. doi: 10.1080/02664763.2019.1692794. eCollection 2020.
Standard latent class modeling has recently been shown to provide a flexible tool for the multiple imputation (MI) of missing categorical covariates in cross-sectional studies. This article introduces an analogous tool for longitudinal studies: MI using Bayesian mixture latent Markov (BMLM) models. Like latent class models, the BMLM model respects the categorical measurement scale of the variables and preserves possibly complex relationships between variables within a measurement occasion. In addition, its Markov dependence structure captures lagged dependencies between adjacent time points, while its time-constant mixture structure captures dependencies across all time points and recovers associations between time-varying and time-constant variables. The performance of the BMLM model for MI is evaluated in a simulation study and an empirical experiment, in which it is compared with complete case analysis and multivariate imputation by chained equations (MICE). Results show that the proposed method performs well in recovering the parameters of the analysis model, whereas the competing methods provide correct estimates only for some aspects of the data.
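The dependence structure described in the abstract can be illustrated with a minimal generative sketch. It is not the authors' implementation: the function names, the parameter layout, and the choice to let time-varying items depend on both the time-constant class and the current latent state are assumptions made for illustration. The imputation step also assumes the latent class and state sequence are already available, whereas a Bayesian procedure would draw them from their posterior (for example with a Gibbs sampler).

```python
import random


def sample_categorical(probs, rng):
    """Draw a category index from a list of probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def simulate_unit(T, mix, init, trans, emit_tv, emit_tc, rng):
    """Simulate one unit from a mixture latent Markov model (hypothetical layout).

    mix[k]           : P(W = k), time-constant mixture class
    init[k][s]       : P(X_1 = s | W = k)
    trans[k][r][s]   : P(X_t = s | X_{t-1} = r, W = k)
    emit_tv[k][s][j] : category probabilities of time-varying item j
    emit_tc[k][j]    : category probabilities of time-constant item j
    """
    # The time-constant class links all occasions and the time-constant items.
    w = sample_categorical(mix, rng)
    z = [sample_categorical(p, rng) for p in emit_tc[w]]

    states, y = [], []
    for t in range(T):
        # First-order Markov chain: each state depends only on the previous one.
        probs = init[w] if t == 0 else trans[w][states[-1]]
        s = sample_categorical(probs, rng)
        states.append(s)
        # Items are conditionally independent given the latent class and state.
        y.append([sample_categorical(p, rng) for p in emit_tv[w][s]])
    return w, z, states, y


def impute_missing(y, w, states, emit_tv, rng):
    """Replace None entries with draws from the matching emission distribution."""
    return [
        [v if v is not None else sample_categorical(emit_tv[w][s][j], rng)
         for j, v in enumerate(row)]
        for row, s in zip(y, states)
    ]
```

Repeating the imputation step with fresh draws of the latent variables and parameters would yield the multiple completed data sets that MI requires; here a single call produces one completed data set.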