
Toward evidence-based medical statistics. 1: The P value fallacy.

Author information

Goodman S N

Affiliation

Johns Hopkins University School of Medicine, Baltimore, Maryland, USA.

Publication information

Ann Intern Med. 1999 Jun 15;130(12):995-1004. doi: 10.7326/0003-4819-130-12-199906150-00008.

Abstract

An important problem exists in the interpretation of modern medical research data: Biological understanding and previous research play little formal role in the interpretation of quantitative results. This phenomenon is manifest in the discussion sections of research articles and ultimately can affect the reliability of conclusions. The standard statistical approach has created this situation by promoting the illusion that conclusions can be produced with certain "error rates," without consideration of information from outside the experiment. This statistical approach, the key components of which are P values and hypothesis tests, is widely perceived as a mathematically coherent approach to inference. There is little appreciation in the medical community that the methodology is an amalgam of incompatible elements, whose utility for scientific inference has been the subject of intense debate among statisticians for almost 70 years. This article introduces some of the key elements of that debate and traces the appeal and adverse impact of this methodology to the P value fallacy, the mistaken idea that a single number can capture both the long-run outcomes of an experiment and the evidential meaning of a single result. This argument is made as a prelude to the suggestion that another measure of evidence should be used: the Bayes factor, which properly separates issues of long-run behavior from evidential strength and allows the integration of background knowledge with statistical findings.
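The contrast the abstract draws between a P value and a Bayes factor can be made concrete with a small sketch. The abstract itself gives no formulas, so everything below is an assumption chosen for illustration: a Gaussian test statistic z, a two-sided P value, and the standard lower bound on the Bayes factor for a Gaussian statistic, exp(-z^2/2), which is the value obtained when the alternative hypothesis is placed exactly at the observed effect. The function names are hypothetical. The point of the sketch is only that the same result (z = 1.96, P close to 0.05) leads to different posterior probabilities for the null hypothesis depending on the background knowledge encoded in the prior.

```python
import math


def two_sided_p_value(z):
    """Two-sided P value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2))


def min_bayes_factor(z):
    """Lower bound on the Bayes factor (null vs. alternative) for a Gaussian z.

    Assumption for illustration: the alternative is placed exactly at the
    observed effect, which is the most favorable case for the alternative.
    """
    return math.exp(-z * z / 2)


def posterior_null_probability(prior_null_prob, bayes_factor):
    """Combine prior odds of the null with a Bayes factor via Bayes' theorem."""
    prior_odds = prior_null_prob / (1 - prior_null_prob)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)


if __name__ == "__main__":
    z = 1.96
    bf = min_bayes_factor(z)
    print(f"z = {z}, two-sided P = {two_sided_p_value(z):.3f}, "
          f"minimum Bayes factor = {bf:.3f}")
    # Same data, different background knowledge (prior probability of the null).
    for prior in (0.25, 0.50, 0.75):
        post = posterior_null_probability(prior, bf)
        print(f"prior P(null) = {prior:.2f} -> posterior P(null) >= {post:.3f}")
```

Because the minimum Bayes factor is the strongest evidence against the null that the data could provide under these assumptions, each posterior probability printed is a lower bound; any other alternative hypothesis would leave the null more probable than shown.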

