Violating the normality assumption may be the lesser of two evils.

Affiliations

Division of Evolutionary Biology, Faculty of Biology, Ludwig Maximilian University of Munich, Grosshaderner Str. 2, 82152, Planegg-Martinsried, Germany.

Department of Behavioural Ecology and Evolutionary Genetics, Max Planck Institute for Ornithology, 82319, Seewiesen, Germany.

Publication information

Behav Res Methods. 2021 Dec;53(6):2576-2590. doi: 10.3758/s13428-021-01587-5. Epub 2021 May 7.

DOI: 10.3758/s13428-021-01587-5
PMID: 33963496
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8613103/

Abstract

When data are not normally distributed, researchers are often uncertain whether it is legitimate to use tests that assume Gaussian errors, or whether one has to either model a more specific error structure or use randomization techniques. Here we use Monte Carlo simulations to explore the pros and cons of fitting Gaussian models to non-normal data in terms of risk of type I error, power and utility for parameter estimation. We find that Gaussian models are robust to non-normality over a wide range of conditions, meaning that p values remain fairly reliable except for data with influential outliers judged at strict alpha levels. Gaussian models also performed well in terms of power across all simulated scenarios. Parameter estimates were mostly unbiased and precise except if sample sizes were small or the distribution of the predictor was highly skewed. Transformation of data before analysis is often advisable and visual inspection for outliers and heteroscedasticity is important for assessment. In strong contrast, some non-Gaussian models and randomization techniques bear a range of risks that are often insufficiently known. High rates of false-positive conclusions can arise for instance when overdispersion in count data is not controlled appropriately or when randomization procedures ignore existing non-independencies in the data. Hence, newly developed statistical methods not only bring new opportunities, but they can also pose new threats to reliability. We argue that violating the normality assumption bears risks that are limited and manageable, while several more sophisticated approaches are relatively error prone and particularly difficult to check during peer review. Scientists and reviewers who are not fully aware of the risks might benefit from preferentially trusting Gaussian mixed models in which random effects account for non-independencies in the data.
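The type I error check described in the abstract can be illustrated with a minimal Monte Carlo sketch. This is not the authors' simulation design: the two-group comparison, the group size of 50, the exponential distribution standing in for "non-normal data", and the names `pooled_t` and `type_i_error_rate` are all assumptions made for illustration. Both groups are drawn from the same distribution, so every rejection is a false positive, and a rate near 0.05 suggests the Gaussian test is holding its nominal level.

```python
import math
import random

# Two-sided 5% critical value of Student's t with 98 degrees of freedom
# (two groups of 50). Hard-coded so the sketch needs only the standard library.
T_CRIT_DF98 = 1.984
GROUP_SIZE = 50


def mean_and_var(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


def pooled_t(a, b):
    """Student's two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    mean_a, var_a = mean_and_var(a)
    mean_b, var_b = mean_and_var(b)
    pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean_a - mean_b) / se


def type_i_error_rate(draw, n_sims=2000, seed=1):
    """Share of null simulations in which |t| exceeds the critical value."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        # Both groups come from the same distribution: the null is true.
        a = [draw(rng) for _ in range(GROUP_SIZE)]
        b = [draw(rng) for _ in range(GROUP_SIZE)]
        if abs(pooled_t(a, b)) > T_CRIT_DF98:
            rejections += 1
    return rejections / n_sims


normal_rate = type_i_error_rate(lambda rng: rng.gauss(0.0, 1.0))
skewed_rate = type_i_error_rate(lambda rng: rng.expovariate(1.0))
print(f"normal data: {normal_rate:.3f}")
print(f"skewed data: {skewed_rate:.3f}")
```

Swapping the `draw` function for a heavier-tailed or outlier-prone distribution, or shrinking `GROUP_SIZE` (with a matching critical value), is one way to probe the conditions under which the abstract reports that p values become less reliable.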

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ef2/8613103/e41502c9be17/13428_2021_1587_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ef2/8613103/175667f74aea/13428_2021_1587_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ef2/8613103/008c2c2d0e69/13428_2021_1587_Fig3_HTML.jpg

Similar articles

1. Violating the normality assumption may be the lesser of two evils.
   Behav Res Methods. 2021 Dec;53(6):2576-2590. doi: 10.3758/s13428-021-01587-5. Epub 2021 May 7.
2. Subgroup analyses in randomised controlled trials: quantifying the risks of false-positives and false-negatives.
   Health Technol Assess. 2001;5(33):1-56. doi: 10.3310/hta5330.
3. Not quite normal: Consequences of violating the assumption of normality in regression mixture models.
   Struct Equ Modeling. 2012;19(2):227-249. doi: 10.1080/10705511.2012.659622. Epub 2012 May 17.
4. Parametric and nonparametric population methods: their comparative performance in analysing a clinical dataset and two Monte Carlo simulation studies.
   Clin Pharmacokinet. 2006;45(4):365-83. doi: 10.2165/00003088-200645040-00003.
5. Residual Normality Assumption and the Estimation of Multiple Membership Random Effects Models.
   Multivariate Behav Res. 2018 Nov-Dec;53(6):898-913. doi: 10.1080/00273171.2018.1533445. Epub 2018 Dec 6.
6. More about the basic assumptions of t-test: normality and sample size.
   Korean J Anesthesiol. 2019 Aug;72(4):331-335. doi: 10.4097/kja.d.18.00292. Epub 2019 Apr 1.
7. Application of robust regression in translational neuroscience studies with non-Gaussian outcome data.
   Front Aging Neurosci. 2024 Jan 24;15:1299451. doi: 10.3389/fnagi.2023.1299451. eCollection 2023.
8. A Monte Carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design.
   BMC Med Res Methodol. 2014 Jan 24;14:14. doi: 10.1186/1471-2288-14-14.
9. Preliminary testing for normality: some statistical aspects of a common concept.
   Clin Exp Dermatol. 2006 Nov;31(6):757-61. doi: 10.1111/j.1365-2230.2006.02206.x.
10. On the efficacy of procedures to normalize Ex-Gaussian distributions.
    Front Psychol. 2015 Jan 7;5:1548. doi: 10.3389/fpsyg.2014.01548. eCollection 2014.

Cited by

1. Assessing the robustness of normality tests under varying skewness and kurtosis: a practical checklist for public health researchers.
   BMC Med Res Methodol. 2025 Sep 1;25(1):206. doi: 10.1186/s12874-025-02641-y.
2. Autistic traits relate to speed/accuracy trade-off but not statistical learning and updating.
   Sci Rep. 2025 Aug 30;15(1):32001. doi: 10.1038/s41598-025-16138-7.
3. The GLM-spectrum: A multilevel framework for spectrum analysis with covariate and confound modelling.
   Imaging Neurosci (Camb). 2024 Feb 2;2. doi: 10.1162/imag_a_00082. eCollection 2024.
4. Impaired physical function in relation to later-life exposure to ambient fine particulate matter and ozone among Chinese middle-aged and older adults.
   BMC Public Health. 2025 Aug 2;25(1):2616. doi: 10.1186/s12889-025-23885-9.
5. Active nudging towards digital well-being: reducing excessive screen time on mobile phones and potential improvement for sleep quality.
   Front Psychiatry. 2025 Jul 17;16:1602997. doi: 10.3389/fpsyt.2025.1602997. eCollection 2025.
6. Reconstructing History: Scale Analysis Reveals Long-Term Changes in Age-Related Growth of a Coregonid Fish.
   Ecol Evol. 2025 Jul 30;15(8):e71884. doi: 10.1002/ece3.71884. eCollection 2025 Aug.
7. Altered reactivity to threatening stimuli in models of Parkinson's disease, revealed by a trial-based assay.
   Elife. 2025 Jul 29;13:RP90905. doi: 10.7554/eLife.90905.
8. Estimation of Genetic Parameters for Carcass and Meat Quality Traits Using Genomic Information in Yorkshire Pigs.
   Animals (Basel). 2025 Jul 14;15(14):2075. doi: 10.3390/ani15142075.
9. The Value of Individual Screen Response Time in Predicting Student Test Performance: Evidence from TIMSS 2019 Problem Solving and Inquiry Tasks.
   J Intell. 2025 Jul 6;13(7):82. doi: 10.3390/jintelligence13070082.
10. Network Analysis of Anxiety in Prostate Cancer Patients.
    Psychooncology. 2025 Jul;34(7):e70237. doi: 10.1002/pon.70237.