Reinterpretation of the results of randomized clinical trials.

Affiliation

Global Virus Network, Middle East Region, Shiraz, Iran.

Publication information

PLoS One. 2024 Jun 14;19(6):e0305575. doi: 10.1371/journal.pone.0305575. eCollection 2024.

DOI:10.1371/journal.pone.0305575

PMID:38875254

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11178203/

Abstract

BACKGROUND

Randomized clinical trials (RCTs) shape our clinical practice. Several studies report a mediocre replicability rate for the RCTs they examined. Many researchers attribute the relatively low replication rate of RCTs to a p value significance threshold that is too high. To solve this problem, some researchers have proposed a lower threshold, which inevitably decreases the study power.

METHODS

The results of 22 500 RCTs retrieved from the Cochrane Database of Systematic Reviews (CDSR) were reinterpreted using two fixed p value significance thresholds (0.05 and 0.005) and a recently proposed flexible threshold that minimizes the weighted sum of errors in statistical inference.
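The idea of a flexible threshold can be illustrated with a minimal sketch. It is not the procedure used in the paper: it assumes a two-sided, two-sample z-test with equal arms and unit variance, and it assumes the weighted sum of errors has the simple form `weight_fp * alpha + (1 - weight_fp) * beta`. The function names, the weight parameter, and the example effect size and sample sizes are all hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def type_ii_error(alpha: float, effect_size: float, n_per_arm: int) -> float:
    """False-negative rate (beta) of a two-sided, two-sample z-test.

    Assumes equal arm sizes and unit variance in each arm, so the expected
    z statistic under the alternative is effect_size * sqrt(n_per_arm / 2).
    """
    z_crit = _STD.inv_cdf(1 - alpha / 2)
    shift = effect_size * (n_per_arm / 2) ** 0.5
    return _STD.cdf(z_crit - shift) - _STD.cdf(-z_crit - shift)


def weighted_error(alpha, effect_size, n_per_arm, weight_fp=0.5):
    """Assumed cost: weight_fp * alpha + (1 - weight_fp) * beta."""
    beta = type_ii_error(alpha, effect_size, n_per_arm)
    return weight_fp * alpha + (1 - weight_fp) * beta


def flexible_threshold(effect_size, n_per_arm, weight_fp=0.5, grid=10_000):
    """Grid-search the alpha in (0, 1) that minimizes the weighted error."""
    best_alpha, best_cost = None, float("inf")
    for i in range(1, grid):
        alpha = i / grid
        cost = weighted_error(alpha, effect_size, n_per_arm, weight_fp)
        if cost < best_cost:
            best_alpha, best_cost = alpha, cost
    return best_alpha, best_cost


# Hypothetical trials: same standardized effect, different sample sizes.
for n in (50, 200):
    alpha_opt, cost = flexible_threshold(effect_size=0.5, n_per_arm=n)
    print(f"n per arm = {n}: optimal alpha = {alpha_opt:.4f}, weighted error = {cost:.4f}")
```

Under these assumptions the minimizing threshold depends on the effect size and the sample size, so no single fixed cut-off is optimal for every trial.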

RESULTS

With the p < 0.05 criterion, 28.5% of RCTs were significant; with p < 0.005, 14.2%; and with p < the flexible threshold, 9.9% (about two thirds of the RCTs that were significant under the p < 0.05 criterion were no longer significant). Although lowering the p cut-off decreases the false-positive rate, it is not generally associated with a lower weighted sum of errors: the false-negative rate increases (the study power decreases), and important treatments may be left undiscovered. Accurate calculation of the optimal p value threshold requires knowledge of the variance in each study arm, which is available only a posteriori.
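The arithmetic behind the "two thirds" figure, and the loss of power when the cut-off is lowered, can be checked with a short sketch. The percentages come from the abstract; the trial used for the power calculation (standardized effect 0.5, 64 participants per arm, two-sided z-test with unit variance) is a hypothetical example, not data from the study.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

# Share of RCTs (in percent) that were significant under each criterion,
# as reported in the abstract.
SIGNIFICANT_SHARE = {"p < 0.05": 28.5, "p < 0.005": 14.2, "p < flexible": 9.9}

baseline = SIGNIFICANT_SHARE["p < 0.05"]
# Fraction of the p < 0.05 significant RCTs that lose significance.
lost = {name: 1 - share / baseline for name, share in SIGNIFICANT_SHARE.items()}


def power_two_sided(alpha: float, effect_size: float, n_per_arm: int) -> float:
    """Power of a two-sided, two-sample z-test (equal arms, unit variance)."""
    z_crit = STD.inv_cdf(1 - alpha / 2)
    shift = effect_size * (n_per_arm / 2) ** 0.5
    beta = STD.cdf(z_crit - shift) - STD.cdf(-z_crit - shift)
    return 1 - beta


for name, fraction in lost.items():
    print(f"{name}: {fraction:.1%} of the p < 0.05 findings are no longer significant")

# Hypothetical trial, chosen so that power is about 0.8 at alpha = 0.05.
for alpha in (0.05, 0.005):
    print(f"alpha = {alpha}: power = {power_two_sided(alpha, 0.5, 64):.3f}")
```

In this hypothetical trial, power falls from roughly 0.8 at a threshold of 0.05 to roughly 0.5 at 0.005, which is the false-negative cost the abstract describes.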

CONCLUSIONS

Lowering the p value threshold, as proposed by some researchers, is not reasonable because it might increase the false-negative rate. A flexible p value significance threshold, although it minimizes the error in statistical inference, might not be good enough either, because only a rough estimate can be calculated a priori; the data necessary for precise computation of the most appropriate threshold are available only a posteriori. The frequentist statistical framework thus has an inherent conflict. Alternative methods, such as Bayesian methods, although not perfect, would be more appropriate for the data analysis of RCTs.
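As a rough illustration of what a Bayesian analysis of a two-arm trial can look like, the sketch below estimates the posterior probability that the treatment arm has a higher response rate than the control arm. The abstract does not specify a Bayesian method, so everything here is an assumption: a binary outcome, independent uniform Beta(1, 1) priors, Monte Carlo sampling, and hypothetical trial counts.

```python
import random


def prob_treatment_better(succ_t, n_t, succ_c, n_c, draws=20_000, seed=0):
    """Monte Carlo estimate of P(rate_treatment > rate_control | data).

    Assumes independent Beta(1, 1) priors on each arm's response rate, so
    each posterior is Beta(1 + successes, 1 + failures).
    """
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    wins = 0
    for _ in range(draws):
        p_t = rng.betavariate(1 + succ_t, 1 + n_t - succ_t)
        p_c = rng.betavariate(1 + succ_c, 1 + n_c - succ_c)
        wins += p_t > p_c
    return wins / draws


# Hypothetical trial: 30/50 responders on treatment, 20/50 on control.
posterior = prob_treatment_better(30, 50, 20, 50)
print(f"P(treatment better | data) = {posterior:.3f}")
```

The result is a direct probability statement about the treatment effect rather than a significant or non-significant verdict, so no significance threshold has to be chosen in advance.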

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/885b/11178203/83bb470ba56d/pone.0305575.g001.jpg
