IRCCS Istituto Ortopedico Galeazzi, Unit of Clinical Epidemiology, Milan, Italy.
Department of Biomedical Sciences for Health, University of Milan, Milan, Italy.
Health Qual Life Outcomes. 2019 Jul 22;17(1):127. doi: 10.1186/s12955-019-1196-8.
An observed statistically significant difference between two interventions does not necessarily imply that the difference is clinically important for patients and clinicians. We aimed to assess whether treatment effects in randomized controlled trials (RCTs) for low back pain (LBP) are both statistically significant and clinically relevant, and whether the RCTs were powered to detect clinically relevant differences on continuous outcomes.
We searched for all RCTs included in Cochrane Systematic Reviews (SRs) focusing on the efficacy of rehabilitation interventions for LBP and published up to April 2017. Only RCTs reporting a sample size calculation and a planned minimal important difference were considered. In the primary analysis, we calculated the proportion of RCTs classified as "statistically significant and clinically relevant", "statistically significant but not clinically relevant", "not statistically significant but clinically relevant", and "not statistically significant and not clinically relevant". We then investigated how often the mismatch between statistical significance and clinical relevance was due to inadequate power.
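The four-way classification can be illustrated with a minimal Python sketch. The abstract does not state the exact decision rule, so this assumes that a result counts as clinically relevant when the absolute between-group difference is at least the planned minimal important difference (MID), and as statistically significant when the p-value falls below a conventional alpha of 0.05. The function name, parameters, and example values are hypothetical.

```python
def classify_trial(p_value: float, effect: float, mid: float, alpha: float = 0.05) -> str:
    """Place one trial result into one of the four categories.

    Assumed rule (not taken from the paper): the result is statistically
    significant if p_value < alpha, and clinically relevant if the
    absolute between-group difference reaches the planned MID.
    """
    significant = p_value < alpha
    relevant = abs(effect) >= mid

    if significant and relevant:
        return "statistically significant and clinically relevant"
    if significant:
        return "statistically significant but not clinically relevant"
    if relevant:
        return "not statistically significant but clinically relevant"
    return "not statistically significant and not clinically relevant"


# Hypothetical example: a 1.2-point difference on a pain scale, planned MID of 2 points.
label = classify_trial(p_value=0.01, effect=1.2, mid=2.0)
print(label)
```

In this hypothetical example the difference is statistically significant but smaller than the planned MID, which is the kind of mismatch the analysis set out to count.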
From 20 eligible SRs including 101 RCTs, we identified 42 RCTs encompassing 81 intervention comparisons. Overall, 60% (25 RCTs) were statistically significant, while only 36% (15 RCTs) were both statistically significant and clinically relevant. Most trials (38%) did not discuss the clinical relevance of treatment effects when results did not reach statistical significance. Among trials with non-statistically significant findings, 60% did not reach the planned sample size and were therefore at risk of failing to detect an effect that truly exists (type II error).
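The link between a missed sample size target and type II error can be sketched with a standard normal-approximation power calculation for a two-arm trial with a continuous outcome. This is an illustration under stated assumptions, not the method used in the reviewed trials: equal group sizes, a common standard deviation, and a two-sided test are assumed, and the MID and standard deviation values below are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def n_per_group(mid: float, sd: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Participants needed per arm to detect a difference of `mid`.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / mid)^2
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sd / mid) ** 2)


def achieved_power(n: int, mid: float, sd: float, alpha: float = 0.05) -> float:
    """Approximate power to detect `mid` with `n` participants per arm."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    standard_error = sd * math.sqrt(2 / n)
    return _STD_NORMAL.cdf(mid / standard_error - z_alpha)


# Hypothetical trial: planned MID of 2 points, assumed SD of 4 points.
planned_n = n_per_group(mid=2.0, sd=4.0)
power_if_planned = achieved_power(planned_n, mid=2.0, sd=4.0)
power_if_short = achieved_power(30, mid=2.0, sd=4.0)  # recruitment fell short
print(planned_n, round(power_if_planned, 2), round(power_if_short, 2))
```

Under these assumptions, recruiting fewer participants than planned lowers the achieved power below the 80% target, so a difference as large as the MID is more likely to go undetected.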
Only a minority of positive RCT findings were both statistically significant and clinically relevant. Insufficient attention to, or outright omission of, key design elements of RCTs, such as clinical relevance and power, reduces the reliability of study findings for current practice.