Carlier Chiara, Karch Julian D, Kuppens Peter, Ceulemans Eva
Department of Psychology and Educational Sciences, KU Leuven, Belgium.
Department of Methodology and Statistics, Institute of Psychology, Leiden University, The Netherlands.
Psychol Belg. 2024 Jun 25;64(1):72-84. doi: 10.5334/pb.1297. eCollection 2024.
Profile similarity measures quantify the similarity of two sets of ratings on multiple variables. Yet it remains unclear how the different measures differ or overlap and what information each one conveys, which makes it hard to decide which measure to apply under which circumstances. In this study, we aim to clarify how existing measures interrelate and to provide recommendations for their use by comparing a wide range of profile similarity measures. We took four steps. First, we reviewed 88 similarity measures by applying them to multiple cross-sectional and intensive longitudinal data sets on emotional experience, and retained 43 useful profile similarity measures after eliminating duplicates, complements, and measures that were unsuitable for the intended purpose. Second, we clustered these 43 measures into groups that behave similarly and found three general clusters: one cluster of difference measures, one cluster of product measures that could be split into four more nuanced groups, and one miscellaneous cluster that could be split into two more nuanced groups. Third, we interpreted what unifies these groups and their subgroups, and what information they convey, based on theory and formulas. Last, based on our findings, we discuss recommendations for the choice of measure, propose avoiding the Pearson correlation, and suggest centering profile items when stereotypical patterns threaten to confound the computation of similarity.
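As an illustration of the distinction between difference and product measures and of the item-centering recommendation, here is a minimal Python sketch. It is not taken from the paper: the ratings are made up, and the three measures shown (Euclidean distance as a difference measure, the sum of cross-products as a product measure, and the Pearson correlation) are assumed here to be typical representatives rather than the specific measures the study retained. The helper `center_items` is a hypothetical name for subtracting each item's mean across all profiles.

```python
import math

def euclidean_distance(x, y):
    """Difference-type measure: based on item-wise differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def cross_product(x, y):
    """Product-type measure: sum of item-wise products."""
    return sum(a * b for a, b in zip(x, y))

def pearson(x, y):
    """Pearson correlation between two profiles (centers each profile on its own mean)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    return cross_product(dx, dy) / math.sqrt(cross_product(dx, dx) * cross_product(dy, dy))

def center_items(profiles):
    """Hypothetical helper: subtract each item's mean across all profiles,
    removing the pattern that every profile shares."""
    n = len(profiles)
    item_means = [sum(col) / n for col in zip(*profiles)]
    return [[v - m for v, m in zip(p, item_means)] for p in profiles]

# Made-up ratings on four items; every profile rates the first two items
# high and the last two low (a shared, "stereotypical" pattern).
profiles = [
    [5, 4, 1, 1],
    [4, 5, 2, 1],
    [5, 5, 1, 2],
]
centered = center_items(profiles)

raw_r = pearson(profiles[0], profiles[1])
centered_r = pearson(centered[0], centered[1])
raw_d = euclidean_distance(profiles[0], profiles[1])
centered_d = euclidean_distance(centered[0], centered[1])

print("Pearson, raw items:     ", round(raw_r, 3))
print("Pearson, centered items:", round(centered_r, 3))
print("Euclidean, raw items:     ", round(raw_d, 3))
print("Euclidean, centered items:", round(centered_d, 3))
```

In this toy data the raw correlation between the first two profiles is high mainly because both follow the shared pattern; after item centering it turns negative. The Euclidean distance is unchanged, since subtracting the same item means from both profiles leaves their differences intact.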