School of Psychology, Pontificia Universidad Católica Madre y Maestra, Santiago de los Caballeros, Dominican Republic.
Department of Psychology, University of California, Merced, California, United States of America.
PLoS One. 2020 Apr 17;15(4):e0231525. doi: 10.1371/journal.pone.0231525. eCollection 2020.
Growth Mixture Modeling (GMM) has gained great popularity in recent decades as a methodology for longitudinal data analysis. The usual assumption of normally distributed repeated measures has been shown to be problematic in real-life data applications. Namely, fitting a normal GMM to data that are even slightly skewed can lead to overselection of the number of latent classes. To ameliorate this unwanted result, GMM based on the skew t family of continuous distributions has been proposed. This family includes the normal, skew normal, t, and skew t distributions. This simulation study aims to determine how efficiently fit indices and likelihood ratio tests (LRTs) select the "true" number of latent groups in GMM based on the skew t family of continuous distributions. Results show that the skew t GMM was the only model considered whose fit index and LRT false positive rates were under the 0.05 cutoff value across sample sizes and for normal as well as skewed and kurtic data. Simulation results are corroborated by a real educational data application example. These findings favor the development of practical guides on the benefits and risks of using GMM based on this family of distributions.
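As a rough illustration of how fit indices can drive the choice of the number of latent classes, the sketch below compares a 1-class and a 2-class solution with AIC and BIC and then summarizes overselection across replications. It is not the procedure used in the study: the log-likelihoods, parameter counts, sample size, and the helper names `select_num_classes` and `false_positive_rate` are hypothetical, and the false positive rate is assumed here to mean the share of replications in which more classes are selected than truly exist.

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion: -2 log L + 2 p (smaller is better)."""
    return -2.0 * loglik + 2.0 * n_params


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: -2 log L + p ln(n) (smaller is better)."""
    return -2.0 * loglik + n_params * math.log(n_obs)


def select_num_classes(fits, n_obs):
    """Return the class count with the smallest BIC.

    fits maps a number of classes k to a (log-likelihood, n_params) pair.
    """
    return min(fits, key=lambda k: bic(fits[k][0], fits[k][1], n_obs))


def false_positive_rate(selected, true_k):
    """Share of replications that selected more classes than true_k (assumed definition)."""
    return sum(1 for k in selected if k > true_k) / len(selected)


# Hypothetical fitted values for one replication (not taken from the paper).
fits = {1: (-1520.0, 9), 2: (-1512.0, 13)}
n_obs = 300

chosen = select_num_classes(fits, n_obs)
print("BIC selects", chosen, "class(es)")
print("AIC, 1 class:", aic(*fits[1]), "| AIC, 2 classes:", aic(*fits[2]))

# Hypothetical selections over four replications with one true class.
print("False positive rate:", false_positive_rate([1, 1, 2, 1], true_k=1))
```

With these made-up numbers the two indices disagree: BIC's heavier penalty keeps the 1-class model, while AIC is lower for the 2-class model. This kind of disagreement is why simulation studies report false positive rates separately for each index.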