Permutation tests are robust and powerful at 0.5% and 5% significance levels.

Author information

Noguchi Kimihiro, Konietschke Frank, Marmolejo-Ramos Fernando, Pauly Markus

Affiliations

Department of Mathematics, Western Washington University, Bellingham, WA, 98225, USA.

Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Institute of Biometry and Clinical Epidemiology, Charitéplatz 1, Berlin, 10117, Germany.

Publication information

Behav Res Methods. 2021 Dec;53(6):2712-2724. doi: 10.3758/s13428-021-01595-5. Epub 2021 May 28.

Abstract

The recent replication crisis has led to a number of ad hoc suggestions for reducing the chance of false-positive findings. Among them, Johnson (Proceedings of the National Academy of Sciences, 110, 19313-19317, 2013) and Benjamin et al. (Nature Human Behaviour, 2, 6-10, 2018) recommend using a significance level of α = 0.005 (0.5%) instead of the conventional 0.05 (5%) level. Although their suggestion is easy to implement, it is unclear whether the commonly used statistical tests are robust and/or powerful at such a small significance level. The main aim of our study is therefore to investigate the robustness and power curve behavior of independent (unpaired) two-sample tests for metric and ordinal data at nominal significance levels of α = 0.005 and α = 0.05. An extensive simulation study shows that the permutation versions of the Welch t-test and the Brunner-Munzel test are particularly robust and powerful, whereas the commonly used two-sample tests that rely on the t-distribution tend to be either liberal or conservative and show peculiar power curve behavior under skewed distributions with variance heterogeneity.
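
To make the idea of a "permutation version of the Welch t-test" concrete, the following is a minimal sketch of one common Monte Carlo construction: the Welch statistic is recomputed on random relabelings of the pooled data, and the two-sided p-value is the share of relabelings whose statistic is at least as extreme as the observed one. It is an illustration under stated assumptions, not the authors' implementation; the function names, the default `n_perm`, the fixed `seed`, and the `(hits + 1) / (n_perm + 1)` p-value convention are all choices made here for the example.

```python
import math
import random
import statistics


def welch_t(x, y):
    """Welch t-statistic for two independent samples (unequal variances allowed)."""
    vx = statistics.variance(x)  # sample variance, divisor n - 1
    vy = statistics.variance(y)
    # Raises ZeroDivisionError if both samples are constant; not handled in this sketch.
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(vx / len(x) + vy / len(y))


def permutation_welch_test(x, y, n_perm=1999, seed=0):
    """Two-sided Monte Carlo permutation test based on the Welch t-statistic.

    Hypothetical helper for illustration: n_perm and seed are assumptions,
    not values taken from the paper.
    """
    rng = random.Random(seed)
    t_obs = welch_t(x, y)
    pooled = list(x) + list(y)
    nx = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling of the pooled observations
        t_perm = welch_t(pooled[:nx], pooled[nx:])
        if abs(t_perm) >= abs(t_obs):
            hits += 1
    # Counting the observed labeling itself keeps the p-value strictly positive.
    p_value = (hits + 1) / (n_perm + 1)
    return t_obs, p_value


if __name__ == "__main__":
    a = [4.1, 5.3, 3.8, 6.0, 4.9, 5.5, 4.4, 5.1]
    b = [6.2, 7.9, 5.8, 9.4, 6.6, 8.1, 7.3, 10.2]
    t, p = permutation_welch_test(a, b)
    print(f"t = {t:.3f}, p = {p:.4f}")
    print("reject at alpha = 0.05: ", p < 0.05)
    print("reject at alpha = 0.005:", p < 0.005)
```

With this p-value convention the smallest attainable p-value is 1 / (n_perm + 1), so testing at α = 0.005 needs at least 200 relabelings before rejection is even possible; the default of 1999 gives a floor of 0.0005.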
