On the use of receiver operating characteristic curve analysis to determine the most appropriate p value significance threshold.

Affiliation

Global Virus Network, Middle East Region of Global Virus Network (GVN), Shiraz, Iran.

Publication information

J Transl Med. 2024 Jan 4;22(1):16. doi: 10.1186/s12967-023-04827-8.

DOI:10.1186/s12967-023-04827-8
PMID:38178182
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10765856/

Abstract

BACKGROUND

The p value is the most commonly reported statistic in scientific research articles. The conventional significance threshold of 0.05 used for the p value in research articles is unfounded. Many researchers have tried to provide a reasonable threshold for the p value; some have proposed a lower threshold, e.g., 0.005. However, none of the proposals has gained universal acceptance. Using the analogy between diagnostic tests with continuous results and statistical tests of hypothesis, I wish to present a method to calculate the most appropriate p value significance threshold using receiver operating characteristic (ROC) curve analysis.
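
The analogy between a continuous diagnostic test and a hypothesis test can be made concrete with a small sketch. It is not taken from the article: it assumes a one-sided z-test whose statistic is shifted by a hypothetical amount `delta` when the alternative hypothesis is true. Each candidate threshold then gives one ROC point, with the false-positive rate equal to the threshold (type I error) and the sensitivity equal to the power. The names `roc_points` and `auc` are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def roc_points(delta, n_points=2001):
    """Return (alpha, power) pairs for a one-sided z-test.

    delta is the assumed shift of the z statistic under the alternative
    hypothesis (an illustrative assumption, not a value from the article).
    """
    points = [(0.0, 0.0)]
    for i in range(1, n_points):
        alpha = i / n_points                       # false-positive rate
        z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)   # critical value at alpha
        power = 1.0 - STD_NORMAL.cdf(z_crit - delta)  # sensitivity
        points.append((alpha, power))
    points.append((1.0, 1.0))
    return points


def auc(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


curve = roc_points(delta=2.0)
area = auc(curve)
# For two unit-variance normal distributions the area has a known closed form.
expected_area = STD_NORMAL.cdf(2.0 / sqrt(2.0))
```

Under these assumptions the numerical area should agree closely with the closed form, and a larger `delta` (a larger effect or sample size) pushes the curve toward the top-left corner, just as a more discriminating diagnostic test would.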

RESULTS

As with diagnostic tests, where the most appropriate cut-off value differs depending on the situation, there is no unique cut-off for the p value significance threshold. Unlike previous proposals, which mostly suggest lowering the threshold to a fixed value (e.g., from 0.05 to 0.005), the most appropriate p value significance threshold proposed here is, in most instances, much lower than the conventional cut-off of 0.05 and varies from study to study and from one statistical test to another, even within a single study. The proposed method provides the minimum weighted sum of type I and type II errors.
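
The idea of minimizing a weighted sum of type I and type II errors can be sketched as follows. This is a simplified illustration, not the article's procedure: it assumes a one-sided z-test with a hypothetical shift `delta` under the alternative, and hypothetical weights `w1` and `w2` on the two error rates (which could absorb prior probabilities and relative costs). Under those assumptions the minimum occurs where the likelihood ratio of the two distributions equals `w1 / w2`, which gives a closed form; a grid search is included only as a cross-check.

```python
from math import log, log10
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def type_ii_error(alpha, delta):
    """Type II error of a one-sided z-test at threshold alpha, shift delta."""
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)
    return STD_NORMAL.cdf(z_crit - delta)


def weighted_error(alpha, delta, w1=1.0, w2=1.0):
    """Weighted sum of type I error (alpha) and type II error."""
    return w1 * alpha + w2 * type_ii_error(alpha, delta)


def best_alpha_closed_form(delta, w1=1.0, w2=1.0):
    """Threshold minimizing the weighted error (requires delta > 0).

    Setting the derivative to zero gives exp(delta*z - delta**2/2) = w1/w2,
    so the optimal critical value is z = delta/2 + ln(w1/w2)/delta.
    """
    z_star = delta / 2.0 + log(w1 / w2) / delta
    return 1.0 - STD_NORMAL.cdf(z_star)


def best_alpha_grid(delta, w1=1.0, w2=1.0, steps=20000):
    """Cross-check: brute-force search on a log-spaced grid of thresholds."""
    lo, hi = -8.0, log10(0.5)
    grid = [10.0 ** (lo + (hi - lo) * i / steps) for i in range(steps + 1)]
    return min(grid, key=lambda a: weighted_error(a, delta, w1, w2))


# Example: standardized effect 0.5 with n = 100 gives delta = 0.5 * sqrt(100) = 5.
alpha_equal = best_alpha_closed_form(delta=5.0)                   # about 0.0062
alpha_grid = best_alpha_grid(delta=5.0)
alpha_strict = best_alpha_closed_form(delta=5.0, w1=4.0, w2=1.0)  # costlier false positives
```

In this toy setting the optimum with equal weights is roughly 0.006, well below 0.05, and it moves whenever `delta` or the weights change, which is consistent with the abstract's point that no single threshold fits every study or test.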

CONCLUSIONS

Given the perplexity involved in using frequentist statistics correctly (dealing with different p value significance thresholds, even in a single study), it seems that the p value is no longer a proper statistic to be used in our research; it should be replaced by alternative methods, e.g., Bayesian methods.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f885/10765856/074960eaa6fc/12967_2023_4827_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f885/10765856/d6010b252516/12967_2023_4827_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f885/10765856/4c23fcedef31/12967_2023_4827_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f885/10765856/02db9fbcf25f/12967_2023_4827_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f885/10765856/67cbed3f75c3/12967_2023_4827_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f885/10765856/965df75bd747/12967_2023_4827_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f885/10765856/2bc6447c5f47/12967_2023_4827_Fig7_HTML.jpg
