

Reducing Bias and Error in the Correlation Coefficient Due to Nonnormality.

Author information

Bishara Anthony J, Hittner James B

Affiliation

College of Charleston, Charleston, SC, USA.

Publication information

Educ Psychol Meas. 2015 Oct;75(5):785-804. doi: 10.1177/0013164414557639. Epub 2014 Nov 11.

Abstract

It is more common for educational and psychological data to be nonnormal than to be approximately normal. This tendency may lead to bias and error in point estimates of the Pearson correlation coefficient. In a series of Monte Carlo simulations, the Pearson correlation was examined under conditions of normal and nonnormal data, and it was compared with its major alternatives, including the Spearman rank-order correlation, the bootstrap estimate, the Box-Cox transformation family, and a general normalizing transformation (i.e., rankit), as well as with various bias adjustments. Nonnormality caused the correlation coefficient to be inflated by up to +.14, particularly when the nonnormality involved heavy-tailed distributions. Traditional bias adjustments worsened this problem, further inflating the estimate. The Spearman and rankit correlations eliminated this inflation and provided conservative estimates. Rankit also minimized random error for most sample sizes, except for the smallest samples (n = 10), where bootstrapping was more effective. Overall, results justify the use of carefully chosen alternatives to the Pearson correlation when normality is violated.
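To make the comparison in the abstract concrete, the following is a minimal sketch of three of the estimators it names: the Pearson correlation, the Spearman rank-order correlation, and a rankit-based correlation. It is an illustration, not the authors' simulation code. The function names are hypothetical, and the rankit formula used here, the inverse normal CDF of (rank - 0.5) / n, is a common definition that is assumed rather than taken from the abstract. Only the Python standard library is used.

```python
import math
from statistics import NormalDist


def pearson(x, y):
    """Plain Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole block of values tied with order[i].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rankit(values):
    """Assumed rankit definition: inverse normal CDF of (rank - 0.5) / n."""
    n = len(values)
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - 0.5) / n) for r in average_ranks(values)]


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rankit_correlation(x, y):
    """Pearson correlation computed on rankit-transformed scores."""
    return pearson(rankit(x), rankit(y))


if __name__ == "__main__":
    # Illustrative data only: y is a monotone but strongly skewed function of x.
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    y = [math.exp(v) for v in x]
    print("Pearson:", pearson(x, y))
    print("Spearman:", spearman(x, y))
    print("Rankit:", rankit_correlation(x, y))
```

In this toy example the relationship is perfectly monotone, so the two rank-based estimates equal 1 up to rounding while the Pearson estimate falls below 1. That shows only how the transformations behave on skewed data; it does not reproduce the bias and error results reported in the abstract, which come from Monte Carlo simulation.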


