Do multiple outcome measures require p-value adjustment?

Author information

Feise Ronald J

Affiliation

Institute of Evidence-Based Chiropractic, 6252 Rookery Road, Fort Collins, Colorado 80528, USA.

Publication information

BMC Med Res Methodol. 2002 Jun 17;2:8. doi: 10.1186/1471-2288-2-8.

DOI:10.1186/1471-2288-2-8

PMID:12069695

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC117123/

Abstract

BACKGROUND

Readers may question the interpretation of findings in clinical trials when multiple outcome measures are used without adjustment of the p-value. This question arises because of the increased risk of Type I errors (findings of false "significance") when multiple simultaneous hypotheses are tested at set p-values. The primary aim of this study was to estimate the need to make appropriate p-value adjustments in clinical trials to compensate for a possible increased risk in committing Type I errors when multiple outcome measures are used.
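
The inflation of Type I error described in the background can be illustrated with a short calculation. The sketch below is not taken from the article; it assumes that all null hypotheses are true and that the tests are statistically independent, which is a simplification because outcome measures in a single trial are often correlated. The function name `familywise_error_rate` is a hypothetical helper introduced for illustration.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false-positive result among n_tests tests.

    Assumes every null hypothesis is true and the tests are independent
    (an illustrative assumption, not a claim made by the article).
    """
    # P(no false positive in one test) = 1 - alpha, so for n independent
    # tests P(at least one false positive) = 1 - (1 - alpha) ** n.
    return 1.0 - (1.0 - alpha) ** n_tests


# Show how the risk grows as more outcome measures are tested at alpha = 0.05.
for k in (1, 2, 5, 10, 20):
    rate = familywise_error_rate(0.05, k)
    print(f"{k:>2} outcome measures: family-wise error rate = {rate:.3f}")
```

Under these assumptions, ten outcome measures tested at 0.05 each give roughly a 40% chance of at least one spurious "significant" finding, which is the concern the classicist position raises.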

DISCUSSION

The classicists believe that the chance of finding at least one test statistically significant due to chance and incorrectly declaring a difference increases as the number of comparisons increases. The rationalists have the following objections to that theory: 1) P-value adjustments are calculated based on how many tests are to be considered, and that number has been defined arbitrarily and variably; 2) P-value adjustments reduce the chance of making type I errors, but they increase the chance of making type II errors or needing to increase the sample size.
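
Both rationalist objections can be made concrete with a rough sketch. It uses the Bonferroni adjustment (dividing the significance level by the number of tests) as one example of a p-value adjustment, and a standard normal-approximation formula for the sample size of a two-group comparison of means. The effect size of 0.5 standard deviations, the 80% power target, and the helper names are assumptions chosen for illustration; none of them come from the article.

```python
from math import ceil
from statistics import NormalDist


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance level under a Bonferroni adjustment."""
    return alpha / n_tests


def n_per_group(alpha: float, power: float, effect_size: float) -> int:
    """Approximate sample size per group for a two-sided, two-group
    comparison of means (normal approximation, equal group sizes).

    effect_size is the standardized mean difference (hypothetical input).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# The adjusted threshold, and hence the required sample size, depends
# entirely on how many tests the analyst decides to count.
for k in (1, 2, 5, 10):
    adjusted = bonferroni_alpha(0.05, k)
    n = n_per_group(adjusted, power=0.80, effect_size=0.5)
    print(f"{k:>2} tests: per-test alpha = {adjusted:.4f}, n per group = {n}")
```

The output shows the trade-off in the second objection: holding the sample size fixed while tightening the threshold lowers power (more Type II errors), and holding power fixed requires more participants. It also shows the first objection, since the result changes with the choice of how many tests to count.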

SUMMARY

Readers should balance a study's statistical significance with the magnitude of effect, the quality of the study and with findings from other studies. Researchers facing multiple outcome measures might want to either select a primary outcome measure or use a global assessment measure, rather than adjusting the p-value.

Similar articles

Statistical estimates and clinical trials.

J Biopharm Stat. 1993 Sep;3(2):249-56. doi: 10.1080/10543409308835063.

Is statistical significance always significant?

Nutr Clin Pract. 2005 Jun;20(3):303-7. doi: 10.1177/0115426505020003303.

A sample size planning approach that considers both statistical significance and clinical significance.

Trials. 2015 May 12;16:213. doi: 10.1186/s13063-015-0727-9.

Multiplicity-adjusted sample size requirements: a strategy to maintain statistical power with Bonferroni adjustments.

J Clin Psychiatry. 2004 Nov;65(11):1511-4.

[P value and confidence intervals: reporting and interpreting the result of a clinical study].

G Ital Nefrol. 2006 Sep-Oct;23(5):490-501.

Estimating significance level and power comparisons for testing multiple endpoints in clinical trials.

Control Clin Trials. 2000 Aug;21(4):313-29. doi: 10.1016/s0197-2456(00)00049-0.

Clinical versus statistical significance: interpreting P values and confidence intervals related to measures of association to guide decision making.

J Pharm Pract. 2010 Aug;23(4):344-51. doi: 10.1177/0897190009358774. Epub 2010 Apr 13.

Multiple comparisons: To compare or not to compare, that is the question.

Res Social Adm Pharm. 2022 Feb;18(2):2331-2334. doi: 10.1016/j.sapharm.2021.07.006. Epub 2021 Jul 8.

Test for the consistency of noninferiority from multiple clinical trials.

J Biopharm Stat. 2007;17(2):265-78. doi: 10.1080/10543400601177400.

Cited by

Exploring the Relation between Contextual Social Determinants of Health and COVID-19 Occurrence and Hospitalization.

Informatics (MDPI). 2024 Mar;11(1). doi: 10.3390/informatics11010004. Epub 2024 Jan 15.

Preliminary effectiveness and feasibility of an integrated hope techniques and narrative-based card game intervention for pediatric cancer patients in China: a randomized controlled trial.

BMC Med. 2025 Jul 31;23(1):449. doi: 10.1186/s12916-025-04287-5.

School Climate and Black Adolescents' Psychological Functioning: The Roles of Parental Self-Efficacy and Parenting Practices.

Behav Sci (Basel). 2025 Jul 10;15(7):933. doi: 10.3390/bs15070933.

(Mis)matches in daily weight stigma perpetrators' and targets' genders and races relative to targets' daily disordered eating behaviors: Examining differences between Black and White women.

J Acad Nutr Diet. 2025 Jul 18. doi: 10.1016/j.jand.2025.07.002.

Aggression as a contributing factor to social defeat and stress vulnerability.

Neurobiol Stress. 2025 Apr 23;36:100728. doi: 10.1016/j.ynstr.2025.100728. eCollection 2025 May.

Prevalence and characteristics of metaraminol usage in a large intensive care patient cohort. A multicentre, retrospective, observational study.

Crit Care Resusc. 2025 Jun 23;27(2):100112. doi: 10.1016/j.ccrj.2025.100112. eCollection 2025 Jun.

Impact of group training on compassion, empathy, and stigmatizing thoughts: a diversity, equity, and inclusion pilot RCT.

Front Psychol. 2025 Jun 20;16:1547645. doi: 10.3389/fpsyg.2025.1547645. eCollection 2025.

A randomized controlled trial into the effectiveness of a mobile health application (SAM) to reduce stress and improve well-being in autistic adults.

Autism. 2025 Jun 26;29(10):13623613251346885. doi: 10.1177/13623613251346885.

Acute carbamoylated erythropoietin reduces social stress-induced anxiety and depression-related behaviors.

Neuropharmacology. 2025 Jun 11;278:110558. doi: 10.1016/j.neuropharm.2025.110558.

Impact of the COVID-19 Pandemic on the Everyday Life and Healthcare of Patients with Congenital Heart Defects: Insights from Pandemic Onset to One Year Later.

J Clin Med. 2025 May 15;14(10):3462. doi: 10.3390/jcm14103462.

References

Behavioral-graded activity compared with usual care after first-time disk surgery: considerations of the design of a randomized clinical trial.

J Manipulative Physiol Ther. 2001 Jan;24(1):67-8. doi: 10.1067/mmt.2001.112007.

Empirical Bayes adjustments for multiple results in hypothesis-generating or surveillance studies.

Cancer Epidemiol Biomarkers Prev. 2000 Sep;9(9):895-903.

Selection of an adaptive test statistic for use with multiple comparison analyses of neuroimaging data.

Neuroimage. 2000 Aug;12(2):219-29. doi: 10.1006/nimg.2000.0608.

Behavioral-graded activity compared with usual care after first-time disk surgery: considerations of the design of a randomized clinical trial.

J Manipulative Physiol Ther. 2000 Jun;23(5):312-9.

Multiple comparison procedures updated.

Clin Exp Pharmacol Physiol. 1998 Dec;25(12):1032-7. doi: 10.1111/j.1440-1681.1998.tb02179.x.

Multiple comparisons, explained.

Am J Epidemiol. 1998 May 1;147(9):807-12; discussion 815. doi: 10.1093/oxfordjournals.aje.a009531.

Invited commentary: Re: "Multiple comparisons and related issues in the interpretation of epidemiologic data".

Am J Epidemiol. 1998 May 1;147(9):801-6. doi: 10.1093/oxfordjournals.aje.a009530.

What's wrong with Bonferroni adjustments.

BMJ. 1998 Apr 18;316(7139):1236-8. doi: 10.1136/bmj.316.7139.1236.

How to read a paper. Statistics for the non-statistician. I: Different types of data need different statistical tests.

BMJ. 1997 Aug 9;315(7104):364-6. doi: 10.1136/bmj.315.7104.364.

Some statistical methods for multiple endpoints in clinical trials.

Control Clin Trials. 1997 Jun;18(3):204-21. doi: 10.1016/s0197-2456(96)00129-8.
