Bishara Anthony J, Hittner James B
Department of Psychology, College of Charleston, 66 George Street, Charleston, SC, 29424, USA.
Behav Res Methods. 2017 Feb;49(1):294-309. doi: 10.3758/s13428-016-0702-8.
With nonnormal data, the typical confidence interval of the correlation (Fisher z') may be inaccurate. The literature has been unclear as to which of several alternative methods should be used instead, and how extreme a violation of normality is needed to justify an alternative. Through Monte Carlo simulation, 11 confidence interval methods were compared, including Fisher z', two Spearman rank-order methods, the Box-Cox transformation, the rank-based inverse normal (RIN) transformation, and various bootstrap methods. Nonnormality often distorted the Fisher z' confidence interval; for example, it led to a nominal 95% confidence interval that had actual coverage as low as 68%. Increasing the sample size sometimes worsened this problem. Inaccurate Fisher z' intervals could be predicted by a sample kurtosis of at least 2, an absolute sample skewness of at least 1, or significant results on hypothesis tests of normality. Only the Spearman rank-order and RIN transformation methods were universally robust to nonnormality. Among the bootstrap methods, an observed imposed bootstrap came closest to accurate coverage, though it often resulted in an overly long interval. The results suggest that sample nonnormality can justify avoiding the Fisher z' interval in favor of a more robust alternative. R code for the relevant methods is provided in supplementary materials.
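To make the two central methods concrete, the following is a minimal Python sketch of the Fisher z' interval and of an RIN-based interval. It is not the authors' supplementary R code. All function names are hypothetical, and the RIN step assumes the rankit offset, (rank - 0.5) / n, followed by an ordinary Pearson correlation and Fisher z' interval on the transformed scores; the paper's own implementation may differ in these details.

```python
import math
from statistics import NormalDist, mean

def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_ci(r, n, level=0.95):
    """Fisher z' interval: transform r, add a normal margin, back-transform.

    Requires n > 3 and |r| < 1 (atanh is undefined at exactly +/-1).
    """
    z = math.atanh(r)                      # Fisher z' transform of r
    se = 1.0 / math.sqrt(n - 3)            # approximate standard error of z'
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1              # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rin_transform(values):
    """Rank-based inverse normal scores (assumed rankit offset of 0.5)."""
    n = len(values)
    nd = NormalDist()
    return [nd.inv_cdf((r - 0.5) / n) for r in average_ranks(values)]

def rin_ci(x, y, level=0.95):
    """Pearson r on RIN-transformed data, with a Fisher z' interval."""
    r = pearson_r(rin_transform(x), rin_transform(y))
    return r, fisher_ci(r, len(x), level)

# Illustrative data only (made up): y contains one extreme outlier.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 7, 5, 8, 100]
print("Fisher z' on raw data:", fisher_ci(pearson_r(x, y), len(x)))
print("RIN-based estimate and interval:", rin_ci(x, y))
```

Because the RIN step depends only on ranks, the outlier in y affects the transformed scores no more than any other largest value would, which is the intuition behind the robustness reported in the abstract. The sketch does not handle perfectly monotone data, where r equals 1 and the Fisher transform is undefined.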