Oosterhoff Jacobien H F, Gravesteijn Benjamin Y, Karhade Aditya V, Jaarsma Ruurd L, Kerkhoffs Gino M M J, Ring David, Schwab Joseph H, Steyerberg Ewout W, Doornberg Job N
Department of Orthopaedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts.
Department of Orthopaedic Surgery, Amsterdam Movement Sciences, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam, the Netherlands.
J Bone Joint Surg Am. 2022 Mar 16;104(6):544-551. doi: 10.2106/JBJS.21.00341.
Statistical models using machine learning (ML) have the potential for more accurate estimates of the probability of binary events than logistic regression. The present study used existing data sets from large musculoskeletal trauma trials to address the following study questions: (1) Do ML models produce better probability estimates than logistic regression models? (2) Are ML models influenced by different variables than logistic regression models?
We created ML and logistic regression models that estimated the probability of a specific fracture (posterior malleolar involvement in distal spiral tibial shaft and ankle fractures, scaphoid fracture, and distal radial fracture) or adverse event (subsequent surgery [after distal biceps repair or tibial shaft fracture], surgical site infection, and postoperative delirium) using 9 data sets from published musculoskeletal trauma studies. Each data set was split into training (80%) and test (20%) subsets. Fivefold cross-validation of the training set was used to develop the ML models. The best-performing model was then assessed in the independent testing data. Performance was assessed by (1) discrimination (c-statistic), (2) calibration (slope and intercept), and (3) overall performance (Brier score).
The mean c-statistic was 0.01 higher for the logistic regression models than for the best ML model for each data set (range, -0.01 to 0.06). Fewer variables were strongly associated with variation in the ML models, and many of them differed from those in the logistic regression models.
The observation that ML models produce probability estimates comparable with logistic regression models for binary events in musculoskeletal trauma suggests that their benefit may be limited in this context.