Institute for Medical and Biomedical Education, St George's, University of London, London, UK.
Department of Obstetrics and Gynecology, Gødstrup Hospital, Gødstrup, Denmark.
Acta Obstet Gynecol Scand. 2022 Jun;101(6):624-627. doi: 10.1111/aogs.14366. Epub 2022 Apr 22.
Traditional null hypothesis significance testing (NHST), with its critical significance level of 0.05, has become the cornerstone of decision-making in health care, and nowhere more so than in obstetric and gynecological research. However, the practice is controversial. In particular, NHST was never intended as a basis for inferring clinical significance from statistical significance. Inferring clinical importance from statistical significance (p < 0.05), and a lack of clinical importance otherwise (p ≥ 0.05), represents a misunderstanding of the original purpose of NHST. Furthermore, the limitations of NHST (its sensitivity to sample size, plus type I and type II errors) are frequently ignored. Decision-making based on NHST can therefore lead to recurrent false claims about the effectiveness of interventions or the importance of exposure to risk factors, or to the dismissal of effects and exposures that do matter. This commentary presents the history behind NHST and the limitations of its modern-day use, and suggests that a statistics reform regarding NHST be considered.
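The sensitivity of NHST to sample size can be illustrated with a minimal sketch. It is not taken from the commentary: it assumes a two-sided z-test comparing the means of two equal-sized groups with a known common standard deviation, and the function name `two_sided_p` and the standardized effect size of 0.1 are hypothetical choices made only for illustration.

```python
import math


def two_sided_p(effect_size: float, n_per_group: int) -> float:
    """Two-sided p-value for a two-sample z-test (illustrative assumption).

    effect_size is the standardized mean difference (difference / SD),
    assumed known and identical in every scenario below.
    """
    # Test statistic for two equal groups with known SD: z = d * sqrt(n / 2)
    z = effect_size * math.sqrt(n_per_group / 2)
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|))
    return math.erfc(abs(z) / math.sqrt(2))


# Same small effect, increasing sample size per group
for n in (50, 200, 800, 3200):
    p = two_sided_p(0.1, n)
    verdict = "p < 0.05" if p < 0.05 else "p >= 0.05"
    print(f"n per group = {n:5d}  p = {p:.4f}  ({verdict})")
```

Under these assumptions the effect is identical in every row, yet the p-value moves from well above 0.05 to well below it as the groups grow, which is why a p-value alone cannot indicate whether an effect is clinically important.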