Reducing Bias and Error in the Correlation Coefficient Due to Nonnormality.

Authors

Bishara Anthony J, Hittner James B

Affiliation

College of Charleston, Charleston, SC, USA.

Publication

Educ Psychol Meas. 2015 Oct;75(5):785-804. doi: 10.1177/0013164414557639. Epub 2014 Nov 11.

DOI:10.1177/0013164414557639
PMID:29795841
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5965513/
Abstract

It is more common for educational and psychological data to be nonnormal than to be approximately normal. This tendency may lead to bias and error in point estimates of the Pearson correlation coefficient. In a series of Monte Carlo simulations, the Pearson correlation was examined under conditions of normal and nonnormal data, and it was compared with its major alternatives, including the Spearman rank-order correlation, the bootstrap estimate, the Box-Cox transformation family, and a general normalizing transformation (i.e., rankit), as well as to various bias adjustments. Nonnormality caused the correlation coefficient to be inflated by up to +.14, particularly when the nonnormality involved heavy-tailed distributions. Traditional bias adjustments worsened this problem, further inflating the estimate. The Spearman and rankit correlations eliminated this inflation and provided conservative estimates. Rankit also minimized random error for most sample sizes, except for the smallest samples (n = 10), where bootstrapping was more effective. Overall, results justify the use of carefully chosen alternatives to the Pearson correlation when normality is violated.
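
The three point estimates compared in the abstract (Pearson, Spearman, and rankit) can be illustrated with a minimal sketch. This is not the authors' simulation code. It assumes the common definition of rankit scores as the inverse normal CDF of (rank - 0.5) / n, with tied values given their average rank, and the function names (`average_ranks`, `rankit`, `pearson`, `spearman`, `rankit_correlation`) are hypothetical names chosen for illustration.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rankit(values):
    """Rankit scores: inverse normal CDF of (rank - 0.5) / n (assumed definition)."""
    n = len(values)
    inv_cdf = NormalDist().inv_cdf
    return [inv_cdf((r - 0.5) / n) for r in average_ranks(values)]


def pearson(x, y):
    """Ordinary Pearson product-moment correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rankit_correlation(x, y):
    """Pearson correlation computed on rankit-transformed variables."""
    return pearson(rankit(x), rankit(y))
```

For a usage example, take a monotone but nonlinear relationship such as `x = [1, 2, 3, 4, 5]` and `y = [math.exp(v) for v in x]`. Both `spearman(x, y)` and `rankit_correlation(x, y)` depend only on the ranks, so they treat the relationship as perfectly monotone, while `pearson(x, y)` is pulled below 1 by the skew in `y`. The bootstrap and Box-Cox alternatives named in the abstract are not covered by this sketch.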



