From the Department of Orthopaedic Surgery, Boston Medical Center, Boston, MA (Dr. Parisien, Dr. Dashe, Dr. Curry, and Dr. Li), the Department of Orthopaedic Surgery, Columbia University Medical Center, New York, NY (Dr. Trofa), the Department of Orthopaedic Surgery, Massachusetts General Hospital & Harvard Medical School, Boston, MA (Dr. Cronin), and the Department of Orthopaedic Surgery, University of Pittsburgh Medical Center, Pittsburgh, PA (Dr. Fu).
J Am Acad Orthop Surg. 2019 Apr 1;27(7):e324-e329. doi: 10.5435/JAAOS-D-17-00636.
Comparative trials evaluating categorical outcomes have important implications for surgical decision making. The purpose of this study was to examine the statistical stability of sports medicine research.
Comparative clinical sports medicine research studies involving the anterior cruciate ligament, meniscus, and knee instability that were published in two journals between 2006 and 2016 were reviewed. The statistical stability of each study outcome was determined by the number of event reversals required to move the P value across the 0.05 threshold, that is, to turn a significant result into a nonsignificant one or vice versa. The number of patients lost to follow-up was also determined.
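The event-reversal count described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' analysis code. It assumes a two-arm study with a binary outcome, a two-sided Fisher exact test, and reversals applied one at a time to the arm with the lower event rate. The function names `fisher_exact_p` and `reversals_to_flip` are hypothetical.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is at least as unlikely as the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def reversals_to_flip(events_a, n_a, events_b, n_b, alpha=0.05):
    """Count outcome reversals needed to move P across alpha.

    Assumption: reversals are applied to the arm with the lower event rate.
    A significant result gains events in that arm (narrowing the gap);
    a nonsignificant result loses events in that arm (widening the gap).
    Returns None if the threshold cannot be crossed this way.
    """
    start_sig = fisher_exact_p(events_a, n_a - events_a,
                               events_b, n_b - events_b) < alpha
    low_is_a = events_a / n_a <= events_b / n_b
    step = 1 if start_sig else -1
    ea, eb, flips = events_a, events_b, 0
    while True:
        if low_is_a:
            ea += step
        else:
            eb += step
        if not (0 <= ea <= n_a and 0 <= eb <= n_b):
            return None
        flips += 1
        p = fisher_exact_p(ea, n_a - ea, eb, n_b - eb)
        if (p < alpha) != start_sig:
            return flips


# Hypothetical example: 0/10 events in one arm versus 7/10 in the other.
print(reversals_to_flip(0, 10, 7, 10))
```

For this made-up example, two reversals are enough to make the result nonsignificant. Comparing such a count with the number of patients lost to follow-up gives a rough sense of how stable the finding is.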
Of the 1,505 studies screened, 102 were included for analysis, 40 of which were randomized controlled trials. There were 339 total outcome events, of which 98 were significant and 241 were not significant. The Fragility Index, or the median number of events required to change the statistical significance of the overall study, was five (interquartile range, 3 to 8), or 5.4% of the total study population. In addition, the average number of patients lost to follow-up was 7.9, which is greater than the number of event reversals needed to change significance for each study arm and for the entire study population.
Results in the comparative sports medicine literature may not be as stable as previously thought, with only a small percentage of outcome events needed to change study significance. Outcomes research based on a single discrete P value cutoff may be misleading.