Predictive power of statistical significance.

Author information

Heston Thomas F, King Jackson M

Affiliations

Department of Family Medicine, University of Washington, Seattle, WA 98195-6340, United States.

Department of Medical Education and Clinical Sciences, Elson S. Floyd College of Medicine, Washington State University, Spokane, WA 99210-1495, United States.

Publication information

World J Methodol. 2017 Dec 26;7(4):112-116. doi: 10.5662/wjm.v7.i4.112.

Abstract

A statistically significant research finding should not be defined as a P-value of 0.05 or less, because this definition does not take into account study power. Statistical significance was originally defined by Fisher RA as a P-value of 0.05 or less. According to Fisher, any finding that is likely to occur by random variation no more than 1 in 20 times is considered significant. Neyman J and Pearson ES subsequently argued that Fisher's definition was incomplete. They proposed that statistical significance could only be determined by analyzing the chance of incorrectly concluding that a study finding is significant (a Type I error) or incorrectly concluding that a study finding is insignificant (a Type II error). Their definition of statistical significance is also incomplete because the error rates are considered separately, not together. A better definition of statistical significance is the positive predictive value of a P-value, which is equal to the power divided by the sum of the power and the P-value. This definition is more complete and relevant than Fisher's or Neyman-Pearson's definitions, because it takes into account both concepts of statistical significance. Using this definition, a statistically significant finding requires a P-value of 0.05 or less when the power is at least 95%, and a P-value of 0.032 or less when the power is 60%. To achieve statistical significance, P-values must be adjusted downward as the study power decreases.
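
The formula in the abstract can be sketched in a few lines of Python. This is an illustrative sketch, not code from the paper: the function names are hypothetical, and the target positive predictive value of 0.95 is an assumption inferred from the abstract's example (a P-value of 0.05 at 95% power gives 0.95 / (0.95 + 0.05) = 0.95).

```python
def positive_predictive_value(power: float, p_value: float) -> float:
    """PPV as stated in the abstract: power / (power + P-value)."""
    if not 0.0 < power <= 1.0:
        raise ValueError("power must be in (0, 1]")
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be in [0, 1]")
    return power / (power + p_value)


def p_value_threshold(power: float, target_ppv: float = 0.95) -> float:
    """Largest P-value for which power / (power + P) >= target_ppv.

    Solving power / (power + P) = target_ppv for P gives
    P = power * (1 - target_ppv) / target_ppv.
    The default target_ppv of 0.95 is an assumption, not a value
    named explicitly in the abstract.
    """
    if not 0.0 < target_ppv < 1.0:
        raise ValueError("target_ppv must be in (0, 1)")
    return power * (1.0 - target_ppv) / target_ppv


if __name__ == "__main__":
    # Thresholds shrink as power falls, matching the abstract's claim.
    for power in (0.95, 0.80, 0.60):
        print(f"power={power:.2f}  P threshold={p_value_threshold(power):.3f}")
```

Under this assumed target, the threshold at 60% power rounds to 0.032, which agrees with the figure quoted in the abstract.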

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/feaf/5746664/b61d508a94e3/WJM-7-112-g001.jpg
