Department of Data Science and Analytics, BI Norwegian Business School, 0484, Oslo, Norway.
Department of Economics, BI Norwegian Business School, 0484, Oslo, Norway.
Psychometrika. 2023 Mar;88(1):241-252. doi: 10.1007/s11336-022-09898-y. Epub 2023 Jan 31.
The polychoric correlation is a popular measure of association for ordinal data. It estimates a latent correlation, i.e., the correlation of a latent vector. This vector is assumed to be bivariate normal, an assumption that cannot always be justified. When bivariate normality does not hold, the polychoric correlation will not necessarily approximate the true latent correlation, even when the observed variables have many categories. We calculate the sets of possible values of the latent correlation when latent bivariate normality is not necessarily true, but at least the latent marginals are known. The resulting sets are called partial identification sets, and are shown to shrink to the true latent correlation as the number of categories increases. Moreover, we investigate partial identification under the additional assumption that the latent copula is symmetric, and calculate the partial identification set when one variable is ordinal and the other is continuous. We show that little can be said about latent correlations unless we have impractically many categories or we know a great deal about the distribution of the latent vector. An open-source R package is available for applying our results.
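The dependence on the normality assumption can be illustrated with a minimal Python sketch for the simplest case: two binary variables obtained by cutting standard normal latent variables at zero. It is not the authors' method and does not use their R package; the function names are hypothetical, and the sign-and-magnitude construction is an assumption chosen only because its correlation has a closed form. The sketch exhibits three latent distributions with standard normal marginals and the same observed 2x2 table but different latent correlations. It makes no claim about where the endpoints of the partial identification set lie.

```python
import math


def orthant_prob_normal(rho):
    """P(Z1 > 0, Z2 > 0) for a standard bivariate normal with correlation rho."""
    return 0.25 + math.asin(rho) / (2 * math.pi)


def tetrachoric_from_orthant(p11):
    """Latent correlation implied by p11 = P(X = 1, Y = 1) under bivariate
    normality with both thresholds at zero (inverse of the orthant formula)."""
    return math.sin(2 * math.pi * (p11 - 0.25))


def sign_mixture_correlation(p11, mean_abs_product):
    """Latent correlation of the hypothetical alternative Z1 = s1 * A, Z2 = s2 * B.

    (s1, s2) are random signs with P(s1 = s2 = +1) = P(s1 = s2 = -1) = p11,
    and A, B are half-normal magnitudes independent of the signs, so each
    marginal is still standard normal. Then
    corr(Z1, Z2) = E[s1 * s2] * E[A * B] = (4 * p11 - 1) * E[A * B].
    """
    return (4 * p11 - 1) * mean_abs_product


# Observed table generated by a bivariate normal latent vector with rho = 0.5.
p11 = orthant_prob_normal(0.5)

# Under normality, the table recovers rho = 0.5.
rho_normal = tetrachoric_from_orthant(p11)

# Same table and same marginals, with equal magnitudes (A = B, so E[A * B] = 1).
rho_equal_magnitudes = sign_mixture_correlation(p11, 1.0)

# Same table and same marginals, with independent magnitudes
# (E[A * B] = E[A] * E[B] = 2 / pi for half-normal A and B).
rho_independent_magnitudes = sign_mixture_correlation(p11, 2 / math.pi)

print(round(rho_normal, 4))
print(round(rho_equal_magnitudes, 4))
print(round(rho_independent_magnitudes, 4))
```

All three latent vectors produce the same cell probabilities, so the observed binary data cannot distinguish among them; only the added assumption of bivariate normality singles out 0.5.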