Department of Economics, BI Norwegian Business School, Oslo, 0484, Norway.
Department of Mathematics, University of Oslo, PB 1053, Blindern, NO-0316, Oslo, Norway.
Psychometrika. 2020 Dec;85(4):1028-1051. doi: 10.1007/s11336-020-09737-y. Epub 2020 Dec 21.
The tetrachoric correlation is a popular measure of association for binary data; it estimates the correlation of an underlying latent vector that is assumed to be normal. However, when the underlying vector is not normal, the tetrachoric correlation will differ from the underlying correlation. Because underlying normality is often assumed on pragmatic rather than substantive grounds, the estimated tetrachoric correlation may be quite different from the true underlying correlation that is modeled in structural equation modeling. This motivates studying the range of latent correlations that are compatible with given binary data when the distribution of the latent vector is partly or completely unknown. We show that nothing can be said about the latent correlations unless we know more than what can be derived from the data. We identify an interval constituting all latent correlations compatible with observed data when the marginals of the latent variables are known. Also, we quantify how partial knowledge of the dependence structure of the latent variables affects the range of compatible latent correlations. Implications for tests of underlying normality are briefly discussed.
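To make the quantity under discussion concrete, the following is a minimal sketch of how a tetrachoric correlation can be computed from a 2x2 table of cell proportions. It is not taken from the paper: the function names `bvn_cdf` and `tetrachoric`, the Simpson-rule integration, and the bisection search are illustrative choices, and the sketch assumes each binary variable equals 1 exactly when a standard normal latent variable exceeds a threshold. Under that assumption the estimate is the correlation at which the bivariate normal probability of the (1, 1) cell matches the observed proportion.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for thresholds and marginal CDFs


def bvn_cdf(h, k, rho, steps=200):
    """P(Z1 <= h, Z2 <= k) for a standard bivariate normal with correlation rho.

    Uses the identity d/d(rho) Phi2(h, k; rho) = phi2(h, k; rho) and integrates
    the bivariate density over the correlation from 0 to rho (Simpson's rule).
    """
    def density(r):
        q = (h * h + k * k - 2.0 * h * k * r) / (2.0 * (1.0 - r * r))
        return math.exp(-q) / (2.0 * math.pi * math.sqrt(1.0 - r * r))

    width = rho / steps  # signed step, so negative rho is handled as well
    total = density(0.0) + density(rho)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * width)
    return _STD.cdf(h) * _STD.cdf(k) + total * width / 3.0


def tetrachoric(p11, p10, p01, p00, tol=1e-8):
    """Tetrachoric correlation from cell proportions, pXY = P(X1 = X, X2 = Y).

    Assumes all marginal proportions lie strictly between 0 and 1.
    """
    total = p11 + p10 + p01 + p00
    p11, p10, p01 = p11 / total, p10 / total, p01 / total
    # Thresholds: X_i = 1 exactly when the latent Z_i exceeds tau_i.
    tau1 = _STD.inv_cdf(1.0 - (p11 + p10))
    tau2 = _STD.inv_cdf(1.0 - (p11 + p01))
    # By symmetry, P(Z1 > tau1, Z2 > tau2) = Phi2(-tau1, -tau2; rho), which is
    # increasing in rho, so bisection finds the matching correlation.
    lo, hi = -0.999, 0.999
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if bvn_cdf(-tau1, -tau2, mid) < p11:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Symmetric table with both marginals equal to 1/2.
    print(round(tetrachoric(1 / 3, 1 / 6, 1 / 6, 1 / 3), 4))  # 0.5
```

For this symmetric table both thresholds are zero, so the result can be checked against the closed form sin(2π(p11 − 1/4)) = sin(π/6) = 0.5. The value is interpretable as the latent correlation only if the latent vector really is bivariate normal, which is the assumption the abstract calls into question.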