Division of Biostatistics, Institute of Social and Preventive Medicine, University of Zurich, Zurich, Switzerland.
Clin Trials. 2013 Apr;10(2):236-42. doi: 10.1177/1740774512468807. Epub 2013 Jan 17.
Misunderstanding of significance tests and P values is widespread in clinical research and elsewhere.
To assess the implications of two common mistakes in the interpretation of statistical significance tests. The first is the misinterpretation of the type I error rate as the expected proportion of false-positive results among all results called significant, a quantity also known as the false-positive report probability (FPRP). The second is the misinterpretation of a P value as the (posterior) probability of the null hypothesis.
A reverse-Bayes approach is used to calculate a lower bound on the proportion of truly effective treatments that ensures the FPRP is equal to or below the type I error rate. A second reverse-Bayes approach, based on minimum Bayes factors (BFs), yields upper bounds on the prior probability of the null hypothesis that would justify interpreting the P value as the posterior probability of the null hypothesis.
In a typical clinical trial setting, more than 50% of the treatments need to be truly effective to justify equating the type I error rate with the FPRP. To interpret the P value as a posterior probability, the difference between the corresponding prior probability and the P value cannot exceed 12.4 percentage points.
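The first result can be illustrated with a short calculation. The sketch below is not taken from the paper; it assumes the standard definition FPRP = α(1 − π) / [α(1 − π) + (1 − β)π], where π is the proportion of truly effective treatments, α the one-sided type I error rate and β the type II error rate. Requiring FPRP ≤ α and solving for π gives π ≥ (1 − α) / (2 − α − β). The function names and the values α = 0.025 and β = 0.20 are illustrative assumptions.

```python
def fprp(alpha, beta, pi):
    """False-positive report probability.

    alpha: type I error rate, beta: type II error rate,
    pi: proportion of truly effective treatments (assumed notation).
    """
    false_positives = alpha * (1 - pi)   # ineffective treatments called significant
    true_positives = (1 - beta) * pi     # effective treatments called significant
    return false_positives / (false_positives + true_positives)


def min_effective_proportion(alpha, beta):
    """Smallest pi for which FPRP <= alpha (reverse-Bayes lower bound)."""
    return (1 - alpha) / (2 - alpha - beta)


# Illustrative values: one-sided alpha of 2.5% and 80% power (beta = 20%).
alpha, beta = 0.025, 0.20
pi_min = min_effective_proportion(alpha, beta)

print(round(pi_min, 3))             # 0.549
print(fprp(alpha, beta, pi_min))    # equals alpha up to rounding error
```

Under these assumed values, about 55% of treatments would have to be truly effective. The bound exceeds 1/2 exactly when α < β.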
The first analysis requires the (one-sided) type I error rate to be smaller than the type II error rate. The second result holds under several different scenarios for transforming P values into minimum BFs.
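The second reverse-Bayes calculation can be sketched in the same way. The abstract does not say which transformations from P values to minimum BFs were used, so the example below assumes one commonly used calibration, minBF = −e · p · ln(p) for p < 1/e, and is not meant to reproduce the 12.4 percentage point figure. Because posterior odds equal the BF times the prior odds, and the BF is at least the minimum BF, setting the posterior probability of the null hypothesis equal to p gives an upper bound on the prior odds of p / [(1 − p) · minBF]. All function names are hypothetical.

```python
import math


def min_bf_e_p_log_p(p):
    """Minimum Bayes factor (null vs. alternative) under the assumed
    -e * p * ln(p) calibration, which is defined only for 0 < p < 1/e."""
    if not 0 < p < 1 / math.e:
        raise ValueError("calibration requires 0 < p < 1/e")
    return -math.e * p * math.log(p)


def max_prior_h0(p, min_bf):
    """Upper bound on the prior probability of the null hypothesis that
    allows the posterior probability of the null to equal p."""
    posterior_odds = p / (1 - p)
    prior_odds = posterior_odds / min_bf   # posterior odds = BF * prior odds
    return prior_odds / (1 + prior_odds)


p = 0.05
prior_bound = max_prior_h0(p, min_bf_e_p_log_p(p))
print(f"prior bound: {prior_bound:.3f}, difference from p: {prior_bound - p:.3f}")
```

Under this assumed calibration, reading P = 0.05 as a posterior probability already requires a prior probability of the null hypothesis of roughly 0.11 or less. Other calibrations give different bounds.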
Both misinterpretations imply strong and often unrealistic assumptions about the prior proportion or probability of truly effective treatments.