Crouch Daniel J M
Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, United Kingdom.
Proc Natl Acad Sci U S A. 2025 Jan 14;122(2):e2415706122. doi: 10.1073/pnas.2415706122. Epub 2025 Jan 10.
A P value is conventionally interpreted either as (a) the probability by chance of obtaining more extreme results than those observed or (b) a tool for declaring significance at a prespecified level. Both approaches carry difficulties: (b) does not allow users to make inferences based on the data in hand, and is not rigorously followed by researchers in practice, while (a) is not meaningful as an error rate. Although P values retain an important role, these shortcomings are likely to have contributed significantly to the scientific reproducibility crisis. We introduce the concept of defining long-run frequentist error rates given the observed data, allowing researchers to make accurate and intuitive inferences about the probability of making an error after proposing that the null hypothesis is false. As one approach, we define the false evidence rate (FER) as the probability, under the null hypothesis, of observing a hypothetical future P value providing evidence toward the alternative hypothesis suggested by the observed P value, which we define as a false positive. FERs are much more conservative than their corresponding P values, consistent with studies demonstrating that the latter do not effectively control error rates across the scientific literature. To obtain an FER below 5%, one needs to obtain a P value below approximately [Formula: see text], while a P value of 5% corresponds to an FER of about 25%.
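As a minimal illustration of the two conventional interpretations only, the sketch below computes a two-sided P value for a standard normal test statistic and simulates how often a prespecified significance level is crossed when the null hypothesis is true. It does not implement the FER, whose exact construction is not given in this abstract. The function names, the choice of a z test, and the simulation settings are all assumptions made for illustration.

```python
import math
import random


def two_sided_p_value(z: float) -> float:
    """Two-sided P value for a standard normal test statistic z.

    Interpretation (a): the probability, under the null hypothesis,
    of a statistic at least as extreme as the one observed.
    """
    return math.erfc(abs(z) / math.sqrt(2.0))


def null_rejection_rate(alpha: float, n_studies: int = 100_000, seed: int = 0) -> float:
    """Fraction of simulated null studies with P <= alpha.

    Interpretation (b): with a prespecified level alpha, this long-run
    fraction should be close to alpha. (Hypothetical simulation setup.)
    """
    rng = random.Random(seed)
    rejections = sum(
        two_sided_p_value(rng.gauss(0.0, 1.0)) <= alpha
        for _ in range(n_studies)
    )
    return rejections / n_studies


if __name__ == "__main__":
    print(two_sided_p_value(1.96))    # close to 0.05
    print(null_rejection_rate(0.05))  # close to 0.05 in the long run
```

The simulated rate is a property of the procedure fixed in advance, not of any single observed P value, which is the gap the abstract describes between interpretations (a) and (b).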