Department of Anesthesiology and Critical Care, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA.
Department of Anesthesiology, Washington University School of Medicine in St. Louis, St. Louis, MO, USA.
Br J Anaesth. 2022 Nov;129(5):643-646. doi: 10.1016/j.bja.2022.06.023. Epub 2022 Jul 22.
We discuss a newly published study examining how phrases are used in clinical trials to describe results when the estimated P-value is close to (slightly above or slightly below) 0.05, which has been arbitrarily designated by convention as the boundary for 'statistical significance'. Terms such as 'marginally significant', 'trending towards significant', and 'nominally significant' are well represented in biomedical literature, but are not actually scientifically meaningful. Acknowledging that 'statistical significance' remains a major determinant of publication, we propose that scientific journals de-emphasise the use of P-values for null hypothesis significance testing, a purpose for which they were never intended, and avoid the use of these ambiguous and confusing terms in scientific articles. Instead, investigators could simply report their findings: effect sizes, P-values, and confidence intervals (or their Bayesian equivalents), and leave it to the discerning reader to infer the clinical applicability and importance. Our goal should be to move away from describing studies (or trials) as positive or negative based on an arbitrary P-value threshold, and rather to judge whether the scientific evidence provided is informative or uninformative.
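As an illustration of reporting an effect size, a confidence interval, and a P-value together, the following minimal sketch summarises a two-arm comparison of event proportions. The counts are hypothetical, the function name `summarize_two_proportions` is our own, and the Wald interval and pooled z-test are one common choice among several; none of these come from the study under discussion.

```python
from math import sqrt
from statistics import NormalDist


def summarize_two_proportions(events_a, n_a, events_b, n_b, level=0.95):
    """Return (risk difference, Wald confidence interval, two-sided P-value)."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b  # effect size: absolute risk difference

    # Unpooled standard error, used for the Wald confidence interval.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)

    # Pooled standard error under the null hypothesis of equal proportions,
    # used for the two-sided z-test.
    pooled = (events_a + events_b) / (n_a + n_b)
    se_null = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se_null
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, ci, p_value


# Hypothetical trial: 30/200 events with treatment, 45/200 with control.
diff, ci, p_value = summarize_two_proportions(30, 200, 45, 200)
print(f"Risk difference: {diff:.3f}")
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"P-value: {p_value:.3f}")
```

With these made-up counts the P-value falls just above 0.05 and the interval only barely includes zero. The three numbers printed together let a reader weigh the plausible range of effects directly, without a label such as 'trending towards significance'.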