What Are the Implications of Alternative Alpha Thresholds for Hypothesis Testing in Orthopaedics?

Author information

D. C. Landy, T. J. Utset-Ward, M. J. Lee, University of Chicago, Chicago, IL, USA.

Publication information

Clin Orthop Relat Res. 2019 Oct;477(10):2358-2363. doi: 10.1097/CORR.0000000000000843.

Abstract

BACKGROUND

Clinical research in orthopaedics typically reports the presence of an association after rejecting a null hypothesis of no association, with the calculated p value evaluated against an alpha threshold of 0.05. This arbitrary threshold is one factor contributing to the current difficulty in reproducing research findings. A proposal to lower the alpha threshold to 0.005 is gaining attention. However, it is currently unknown how alpha thresholds are used in orthopaedics or how the reported p values are distributed.

QUESTIONS/PURPOSES

We sought to describe the use of alpha thresholds in two orthopaedic journals by asking (1) How frequently are alpha threshold values reported? (2) How frequently are power calculations reported? (3) How frequently are p values between 0.005 and 0.05 reported for the main hypothesis? (4) Are p values less than 0.005 associated with study characteristics such as design and reporting power calculations?

METHODS

The 100 most recent original clinical research articles from each of two leading orthopaedic journals at the time of this proposal were reviewed. For studies without a specified primary hypothesis, a main hypothesis was selected that was most consistent with the title and abstract. The p value for the main hypothesis and the lowest p value for each study were recorded. Study characteristics, including details of alpha thresholds, beta, and p values, were recorded. Associations between study characteristics and p values were described. Of the 200 articles (100 from each journal), 23 were randomized controlled trials, 141 were cohort studies or case series (defined as a study in which authors had access to original data collected for the study purpose), 31 were database studies, and five were classified as other.

RESULTS

An alpha threshold was reported in 166 articles (83%), with all but two reporting a value of 0.05. Forty-two articles (21%) reported performing a power calculation. The p value for the main hypothesis was less than 0.005 for 88 articles (44%), between 0.05 and 0.005 for 67 (34%), and greater than 0.05 for 29 (15%). The smallest p value was between 0.05 and 0.005 for 39 articles (20%), less than 0.005 for 143 (72%), and either not provided or greater than 0.05 for 18 (9%). Although 50% (65 of 130) of cohort and database papers had a main hypothesis p value less than 0.005, only 26% (6 of 23) of randomized controlled trials did. Only 36% (15 of 42) of articles reporting a power calculation had a p value less than 0.005 compared with 51% (73 of 142) of those that did not report one.
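The percentages in this paragraph can be re-derived from the reported counts. The following is a minimal Python sketch, not part of the original study: the variable names are illustrative, and the counts are transcribed from the Results paragraph. It also shows that the three main-hypothesis categories sum to 184 of the 200 articles, which matches the 42 + 142 articles in the power-calculation comparison; the abstract does not state how the remaining 16 were classified.

```python
# Counts transcribed from the Results paragraph; names are illustrative only.
TOTAL_ARTICLES = 200

# Main-hypothesis p value categories (boundaries as worded in the abstract).
main_hypothesis_counts = {
    "p < 0.005": 88,
    "0.005 to 0.05": 67,
    "p > 0.05": 29,
}

# Subgroups: (articles with main-hypothesis p < 0.005, articles in subgroup).
subgroup_counts = {
    "cohort and database papers": (65, 130),
    "randomized controlled trials": (6, 23),
    "power calculation reported": (15, 42),
    "no power calculation reported": (73, 142),
}


def percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


main_percentages = {
    label: percent(n, TOTAL_ARTICLES) for label, n in main_hypothesis_counts.items()
}
subgroup_percentages = {
    label: percent(hits, size) for label, (hits, size) in subgroup_counts.items()
}
articles_with_main_p = sum(main_hypothesis_counts.values())

for label, value in main_percentages.items():
    print(f"{label}: {value:.1f}% of all articles")
for label, value in subgroup_percentages.items():
    print(f"{label}: {value:.1f}% with p < 0.005")
print(f"Articles with a categorized main-hypothesis p value: {articles_with_main_p}")
```

Printing one decimal place avoids depending on a particular rounding rule for values such as 33.5% and 14.5%, which the abstract reports as 34% and 15%.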

CONCLUSIONS

Although a lower alpha threshold may theoretically increase the reproducibility of research findings across orthopaedics, this would preferentially select findings from lower-quality studies or increase the burden on higher-quality ones. A more nuanced approach could be to consider alpha thresholds specific to study characteristics. For example, randomized controlled trials with a prespecified primary hypothesis may still be best evaluated at 0.05, while database studies with an abundance of statistical tests may be best evaluated at a threshold even below 0.005.
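One way to see why studies with many statistical tests might warrant a stricter threshold is the familywise error rate: the chance of at least one false-positive result when every null hypothesis is true. The sketch below is an illustration, not the authors' analysis. It assumes the tests are independent, and the numbers of tests shown are hypothetical.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one type-I error across n_tests independent
    tests, each evaluated at threshold alpha, when all nulls are true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(familywise_alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the familywise error rate at or below
    familywise_alpha (Bonferroni correction)."""
    return familywise_alpha / n_tests


# Hypothetical numbers of tests: one prespecified test versus many tests.
for n_tests in (1, 10, 50):
    at_05 = familywise_error_rate(0.05, n_tests)
    at_005 = familywise_error_rate(0.005, n_tests)
    per_test = bonferroni_threshold(0.05, n_tests)
    print(
        f"{n_tests:>2} tests: FWER at 0.05 = {at_05:.3f}, "
        f"FWER at 0.005 = {at_005:.3f}, Bonferroni per-test = {per_test:.4f}"
    )
```

Under these assumptions, a single prespecified test at 0.05 keeps the error rate at 0.05, whereas ten tests at 0.05 push the chance of at least one false positive to about 0.40.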

CLINICAL RELEVANCE

Surgeons and scientists in orthopaedics should understand that the default alpha threshold of 0.05 represents an arbitrary value that could be lowered to help reduce type-I errors; however, it must also be appreciated that such a change could increase type-II errors, increase resource utilization, and preferentially select findings from lower-quality studies.
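The trade-off between type-I errors, type-II errors, and resource use can be made concrete with a standard normal-approximation calculation for a two-sided test. This is a minimal sketch under stated assumptions, not a calculation from the article: it assumes a z-test with a fixed effect size, 80% power as the design target, and it ignores the negligible rejection probability in the opposite tail. The function names are illustrative.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def relative_sample_size(alpha: float, power: float) -> float:
    """Quantity proportional to the required sample size for a two-sided
    z-test at a fixed effect size: (z_{1-alpha/2} + z_{power})^2."""
    z_alpha = STANDARD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    z_power = STANDARD_NORMAL.inv_cdf(power)
    return (z_alpha + z_power) ** 2


def sample_size_inflation(new_alpha: float, old_alpha: float = 0.05,
                          power: float = 0.80) -> float:
    """Factor by which the sample size must grow to keep the same power
    when the threshold moves from old_alpha to new_alpha."""
    return relative_sample_size(new_alpha, power) / relative_sample_size(old_alpha, power)


def power_at_new_alpha(new_alpha: float, old_alpha: float = 0.05,
                       designed_power: float = 0.80) -> float:
    """Approximate power of a study designed for designed_power at old_alpha
    when its result is instead judged at new_alpha (sample size unchanged)."""
    noncentrality = (STANDARD_NORMAL.inv_cdf(1.0 - old_alpha / 2.0)
                     + STANDARD_NORMAL.inv_cdf(designed_power))
    z_new = STANDARD_NORMAL.inv_cdf(1.0 - new_alpha / 2.0)
    return STANDARD_NORMAL.cdf(noncentrality - z_new)


print(f"Sample size inflation, 0.05 -> 0.005: {sample_size_inflation(0.005):.2f}x")
print(f"Power at 0.005 for a study sized for 80% at 0.05: {power_at_new_alpha(0.005):.2f}")
```

Under these assumptions, keeping 80% power at a threshold of 0.005 requires roughly 70% more participants, while leaving the sample size unchanged drops power to about 50%, that is, a type-II error rate near 50%.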
