Violating the normality assumption may be the lesser of two evils.

Affiliations

Division of Evolutionary Biology, Faculty of Biology, Ludwig Maximilian University of Munich, Grosshaderner Str. 2, 82152, Planegg-Martinsried, Germany.

Department of Behavioural Ecology and Evolutionary Genetics, Max Planck Institute for Ornithology, 82319, Seewiesen, Germany.

Publication information

Behav Res Methods. 2021 Dec;53(6):2576-2590. doi: 10.3758/s13428-021-01587-5. Epub 2021 May 7.


DOI:10.3758/s13428-021-01587-5
PMID:33963496
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8613103/
Abstract

When data are not normally distributed, researchers are often uncertain whether it is legitimate to use tests that assume Gaussian errors, or whether one has to either model a more specific error structure or use randomization techniques. Here we use Monte Carlo simulations to explore the pros and cons of fitting Gaussian models to non-normal data in terms of risk of type I error, power and utility for parameter estimation. We find that Gaussian models are robust to non-normality over a wide range of conditions, meaning that p values remain fairly reliable except for data with influential outliers judged at strict alpha levels. Gaussian models also performed well in terms of power across all simulated scenarios. Parameter estimates were mostly unbiased and precise except if sample sizes were small or the distribution of the predictor was highly skewed. Transformation of data before analysis is often advisable and visual inspection for outliers and heteroscedasticity is important for assessment. In strong contrast, some non-Gaussian models and randomization techniques bear a range of risks that are often insufficiently known. High rates of false-positive conclusions can arise for instance when overdispersion in count data is not controlled appropriately or when randomization procedures ignore existing non-independencies in the data. Hence, newly developed statistical methods not only bring new opportunities, but they can also pose new threats to reliability. We argue that violating the normality assumption bears risks that are limited and manageable, while several more sophisticated approaches are relatively error prone and particularly difficult to check during peer review. Scientists and reviewers who are not fully aware of the risks might benefit from preferentially trusting Gaussian mixed models in which random effects account for non-independencies in the data.
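The type I error claim in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation code: the exponential distribution, the group size of 50, the 2000 replicates, and the function names are all assumptions chosen for illustration. Both groups are drawn from the same distribution, so the null hypothesis is true and every significant result is a false positive. The test used is the pooled-variance two-sample t-test, which is equivalent to a Gaussian linear model with one binary predictor.

```python
import random

# Two-sided 5% critical value of Student's t with 98 degrees of freedom
# (two groups of 50). Hard-coded so the sketch needs only the standard library.
T_CRIT_DF98 = 1.9845


def mean(xs):
    return sum(xs) / len(xs)


def sample_variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def t_statistic(a, b):
    """Pooled-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * sample_variance(a) + (nb - 1) * sample_variance(b)) / (na + nb - 2)
    se = (pooled * (1 / na + 1 / nb)) ** 0.5
    return (mean(a) - mean(b)) / se


def type_i_error_rate(draw, n_per_group=50, n_sims=2000, t_crit=T_CRIT_DF98, seed=1):
    """Fraction of null simulations in which |t| exceeds the critical value.

    `draw` is a hypothetical callback that takes a random.Random instance
    and returns one observation.
    """
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_sims):
        a = [draw(rng) for _ in range(n_per_group)]
        b = [draw(rng) for _ in range(n_per_group)]
        if abs(t_statistic(a, b)) > t_crit:
            false_positives += 1
    return false_positives / n_sims


def skewed(rng):
    # Strongly right-skewed errors (assumed example distribution).
    return rng.expovariate(1.0)


def gaussian(rng):
    # Reference case in which the normality assumption holds.
    return rng.gauss(0.0, 1.0)


rate_skewed = type_i_error_rate(skewed)
rate_gaussian = type_i_error_rate(gaussian)
print(f"nominal alpha: 0.05, skewed: {rate_skewed:.3f}, gaussian: {rate_gaussian:.3f}")
```

If the abstract's robustness claim holds for this scenario, both rates should land close to the nominal 0.05. Swapping in a distribution with rare extreme values, or a critical value for a stricter alpha, is one way to probe the exception the abstract mentions for influential outliers.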

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ef2/8613103/008c2c2d0e69/13428_2021_1587_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ef2/8613103/e41502c9be17/13428_2021_1587_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ef2/8613103/175667f74aea/13428_2021_1587_Fig2_HTML.jpg
