KU Leuven, Leuven, Belgium.
Behav Res Methods. 2013 Sep;45(3):782-91. doi: 10.3758/s13428-012-0293-y.
Mixture analysis is commonly used for clustering objects on the basis of multivariate data. When the data contain a large number of variables, regular mixture analysis may become problematic, because a large number of parameters need to be estimated for each cluster. To tackle this problem, the mixtures-of-factor-analyzers (MFA) model was proposed, which combines clustering with exploratory factor analysis. MFA model selection is rather intricate, as both the number of clusters and the number of underlying factors have to be determined. To this end, the Akaike (AIC) and Bayesian (BIC) information criteria are often used. AIC and BIC try to identify a model that optimally balances model fit and model complexity. In this article, the CHull (Ceulemans & Kiers, 2006) method, which also balances model fit and complexity, is presented as an interesting alternative model selection strategy for MFA. In an extensive simulation study, the performances of AIC, BIC, and CHull were compared. AIC performs poorly and systematically selects overly complex models, whereas BIC performs slightly better than CHull when considering the best model only. However, when taking model selection uncertainty into account by looking at the first three models retained, CHull outperforms BIC. This especially holds in more complex, and thus more realistic, situations (e.g., more clusters, factors, noise in the data, and overlap among clusters).
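To make the comparison concrete, the following is a minimal sketch of how the three criteria could be applied to a set of fitted MFA solutions. It assumes that each candidate model is summarized by its log-likelihood and its number of free parameters, and that CHull is run on those two quantities: the models on the upper boundary of the convex hull of the complexity-versus-fit plot are retained, and the one with the largest scree ratio is selected. The function names, model labels, and numbers are hypothetical and serve only as illustration; they are not taken from the simulation study.

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion (lower is better)."""
    return -2.0 * loglik + 2.0 * n_params


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion (lower is better)."""
    return -2.0 * loglik + n_params * math.log(n_obs)


def _cross(o, a, b):
    """Cross product of vectors o->a and o->b (negative means a clockwise turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def chull_ranking(models):
    """Rank hull models by scree ratio, largest first.

    models: list of (label, n_params, loglik) tuples.
    Returns a list of (label, scree_ratio) for the interior hull points.
    """
    # Keep only the best-fitting model for each complexity value.
    best = {}
    for label, k, ll in models:
        if k not in best or ll > best[k][1]:
            best[k] = (label, ll)
    pts = [(k, ll, label) for k, (label, ll) in sorted(best.items())]

    # Drop models that do not fit better than some simpler model.
    mono = []
    for p in pts:
        if not mono or p[1] > mono[-1][1]:
            mono.append(p)

    # Upper boundary of the convex hull (monotone chain, left to right).
    hull = []
    for p in mono:
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)

    # Scree ratio: fit gain per parameter up to this model, divided by
    # the fit gain per parameter from this model to the next hull model.
    ranking = []
    for prev, cur, nxt in zip(hull, hull[1:], hull[2:]):
        slope_in = (cur[1] - prev[1]) / (cur[0] - prev[0])
        slope_out = (nxt[1] - cur[1]) / (nxt[0] - cur[0])
        ranking.append((cur[2], slope_in / slope_out))
    return sorted(ranking, key=lambda r: r[1], reverse=True)


# Made-up candidates: (label, number of free parameters, log-likelihood).
N_OBS = 200
models = [
    ("G1-Q1", 10, -1500.0),
    ("G2-Q1", 20, -1300.0),
    ("G1-Q2", 25, -1400.0),
    ("G2-Q2", 30, -1280.0),
    ("G3-Q2", 40, -1265.0),
]

best_aic = min(models, key=lambda m: aic(m[2], m[1]))[0]
best_bic = min(models, key=lambda m: bic(m[2], m[1], N_OBS))[0]
ranking = chull_ranking(models)

print("AIC selects:", best_aic)
print("BIC selects:", best_bic)
print("CHull ranking:", ranking)
```

In this toy input AIC picks the most complex candidate, while BIC and CHull agree on a simpler one. Sorting by scree ratio also gives a natural way to inspect the first few retained models rather than the single best one.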