Ruzbarsky Joseph J, Khormaee Sariah, Daluiski Aaron
Department of Orthopaedics, Hospital for Special Surgery, New York, NY.
J Hand Surg Am. 2019 Aug;44(8):698.e1-698.e7. doi: 10.1016/j.jhsa.2018.10.005. Epub 2018 Nov 9.
Randomized controlled trials (RCTs) are the gold standard for comparing clinical interventions. Statistical significance, reported as a P value, has been used to determine whether a difference between clinical interventions exists in an RCT. However, P values do not clearly convey information about the robustness of a study's conclusions. An emerging metric, the fragility index (the number of subjects who would need to change outcome category to raise the P value above the .05 threshold), is an indirect measure of how likely a repeat of the trial would be to reach the same conclusions. This study addressed the fragility of hand surgery RCTs that use dichotomous outcomes.
Using a systematic search of the MEDLINE database, we identified hand surgery RCTs published in 11 high-impact journals during the last decade (2007-2017). Studies were included if they involved 2 parallel arms, allocated patients to treatment and control in a 1:1 ratio, and reported statistical significance for a dichotomous variable. The fragility index was calculated with Fisher's exact test, following previously published methods.
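The calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not spell out the "previously published methods", so the sketch assumes the commonly described procedure in which non-events are converted to events one patient at a time in the arm with fewer events, recomputing the two-sided Fisher's exact P value after each change until it reaches the .05 threshold. The function names `fisher_exact_p` and `fragility_index` and all parameter names are hypothetical.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Rows are trial arms; columns are (events, non-events).
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in arm 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    # The small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-7)
    )


def fragility_index(events_1, n_1, events_2, n_2, alpha=0.05):
    """Number of outcome changes needed to raise P to alpha or above.

    Assumed procedure: flip non-events to events in the arm with fewer
    events. Returns 0 when the result is not significant to begin with.
    """
    a, b = events_1, n_1 - events_1
    c, d = events_2, n_2 - events_2
    if fisher_exact_p(a, b, c, d) >= alpha:
        return 0
    if a > c:
        # Put the arm with fewer events first; the test is symmetric.
        a, b, c, d = c, d, a, b
    flips = 0
    while b > 0 and fisher_exact_p(a, b, c, d) < alpha:
        a, b, flips = a + 1, b - 1, flips + 1
    return flips
```

A call such as `fragility_index(0, 20, 10, 20)` takes the event count and arm size for each group (hypothetical numbers, not data from the included trials). Comparing the returned value with the number of patients lost to follow-up then shows whether the missing outcomes alone could have changed the trial's conclusion.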
Five hand surgery RCTs met the inclusion criteria, with fragility indices ranging from 0 to 26. Two of the trials (40%) had a fragility index of 2 or less. In two of the trials (40%), the number of patients lost to follow-up exceeded the fragility index, meaning that the outcomes of those patients could theoretically have reversed the study conclusions.
The range of fragility indices reported in the recent hand surgery literature is consistent with previous reporting within orthopedic surgery.
The fragility index is a useful metric for analyzing the robustness of study conclusions and should complement other methods of critical evaluation, including the P value and effect sizes. Our results emphasize the need for future efforts to strengthen the robustness of RCT conclusions.