Do multiple outcome measures require p-value adjustment?

Author information

Feise Ronald J

Affiliation

Institute of Evidence-Based Chiropractic, 6252 Rookery Road, Fort Collins, Colorado 80528, USA.

Publication information

BMC Med Res Methodol. 2002 Jun 17;2:8. doi: 10.1186/1471-2288-2-8.

Abstract

BACKGROUND

Readers may question the interpretation of findings in clinical trials when multiple outcome measures are used without adjustment of the p-value. This question arises because of the increased risk of Type I errors (findings of false "significance") when multiple simultaneous hypotheses are tested at set p-values. The primary aim of this study was to estimate the need to make appropriate p-value adjustments in clinical trials to compensate for a possible increased risk in committing Type I errors when multiple outcome measures are used.

DISCUSSION

The classicists believe that the chance of finding at least one test statistically significant due to chance, and thus incorrectly declaring a difference, increases as the number of comparisons increases. The rationalists raise the following objections to that theory: 1) P-value adjustments are calculated from the number of tests to be considered, and that number has been defined arbitrarily and variably; 2) P-value adjustments reduce the chance of making Type I errors, but they increase the chance of making Type II errors or require a larger sample size.
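The trade-off described in this discussion can be illustrated numerically. The sketch below is not taken from the article: it assumes the tests are independent (outcome measures in a real trial are usually correlated, so the true inflation is typically smaller), uses the Bonferroni rule as one example of a p-value adjustment, and approximates power with a two-sided z-test. The function names and the standardized effect of 2.8 are hypothetical choices made for illustration.

```python
from statistics import NormalDist


def familywise_error_rate(alpha: float, k: int) -> float:
    """Chance of at least one false positive across k tests at level alpha.

    Assumption: the k tests are independent and every null hypothesis is true.
    """
    return 1.0 - (1.0 - alpha) ** k


def bonferroni_alpha(alpha: float, k: int) -> float:
    """Per-test significance level under the Bonferroni adjustment."""
    return alpha / k


def two_sided_power(effect_z: float, alpha: float) -> float:
    """Approximate power of a two-sided z-test.

    effect_z is the true effect expressed in standard-error units
    (a hypothetical value chosen by the caller).
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    # Probability that the test statistic lands in either rejection region.
    return (1.0 - nd.cdf(z_crit - effect_z)) + nd.cdf(-z_crit - effect_z)


if __name__ == "__main__":
    alpha = 0.05
    effect_z = 2.8  # hypothetical effect giving about 80% power at alpha = 0.05
    for k in (1, 5, 10):
        adjusted = bonferroni_alpha(alpha, k)
        print(
            f"k={k:2d}  unadjusted FWER={familywise_error_rate(alpha, k):.3f}  "
            f"Bonferroni alpha={adjusted:.4f}  "
            f"power at adjusted alpha={two_sided_power(effect_z, adjusted):.3f}"
        )
```

Under these assumptions, five unadjusted tests at 0.05 give a family-wise error rate of about 0.23 and ten give about 0.40, which is the classicists' concern. Applying the Bonferroni level for five tests (0.01) to the same hypothetical effect lowers power from roughly 0.80 to roughly 0.59, which is the rationalists' second objection.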

SUMMARY

Readers should weigh a study's statistical significance against the magnitude of effect, the quality of the study, and findings from other studies. Researchers facing multiple outcome measures might want to either select a primary outcome measure or use a global assessment measure, rather than adjusting the p-value.
