An Interpretable Machine Learning Approach to Predict Survival Outcomes in Spinal and Sacropelvic Chordomas.

Author information

Karabacak Mert, Carr Matthew T, Schupper Alexander J, Bhimani Abhiraj D, Steinberger Jeremy, Margetis Konstantinos

Affiliation

Department of Neurosurgery, Mount Sinai Health System, New York, NY.

Publication information

Spine (Phila Pa 1976). 2024 Apr 12. doi: 10.1097/BRS.0000000000005002.

STUDY DESIGN

Retrospective, population-based cohort study.

OBJECTIVE

This study aimed to develop machine learning (ML) models to predict five-year and 10-year mortality in spinal and sacropelvic chordoma patients and integrate them into a web application for enhanced prognostication.

SUMMARY OF BACKGROUND DATA

Past research has identified factors that influence survival in spinal chordoma patients. While identifying individual predictors is important, personalized survival predictions are equally vital. Prior efforts have produced nomograms for this purpose, but nomograms cannot capture complex interactions within the data and rely on statistical assumptions that may not hold for real-world data.

METHODS

Adult spinal and sacropelvic chordoma patients were identified from the National Cancer Database (NCDB). Sociodemographic, clinicopathologic, diagnostic, and treatment-related variables were used as predictive features. Five supervised ML algorithms (TabPFN, CatBoost, XGBoost, LightGBM, and Random Forest) were implemented to predict mortality at five and 10 years postdiagnosis. Model performance was primarily evaluated using the area under the receiver operating characteristic curve (AUROC). SHapley Additive exPlanations (SHAP) values and partial dependence plots provided feature importance and interpretability. The top models were integrated into a web application.
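As an illustration of the primary metric, the following is a minimal sketch of how an AUROC can be computed from binary outcome labels and predicted risks, using the rank-sum (Mann-Whitney U) identity. It is not the authors' code; the function name `auroc` and the toy labels and scores are assumptions made purely for illustration and do not come from the NCDB.

```python
def auroc(y_true, y_score):
    """AUROC via the rank-sum identity: the probability that a randomly
    chosen positive case is scored higher than a randomly chosen negative
    case, with ties counted as one half."""
    pairs = sorted(zip(y_score, y_true))  # ascending by predicted score
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank shared by the ties
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


# Hypothetical toy data (1 = died within the horizon, 0 = survived).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auroc(toy_labels, toy_scores)
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking of deaths above survivors. A "mean AUROC" would then be the average of such values over repeated evaluation splits.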

RESULTS

From the NCDB, 1206 adult patients with histologically confirmed spinal and sacropelvic chordomas were included in the five-year mortality analysis, of whom 423 (35.1%) died within five years, and 801 patients were included in the 10-year mortality analysis, of whom 588 (73.4%) died within 10 years. CatBoost produced the top-performing model for both outcomes. The CatBoost model for five-year mortality achieved a mean AUROC of 0.801, and the CatBoost model for 10-year mortality achieved a mean AUROC of 0.814.

CONCLUSIONS

This study developed ML models that can accurately predict five-year and 10-year mortality in spinal chordoma patients. Integrating these interpretable, personalized prognostic models into a web application provides quantitative survival estimates for a given patient. Local interpretability shows how each feature contributes to an individual prediction. Further external validation is warranted to support generalizability and clinical utility.
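To make the idea of local interpretability concrete, the sketch below computes exact Shapley values for a single prediction by enumerating all feature subsets, with absent features replaced by a baseline value. This is a simplified illustration of the attribution principle behind SHAP, not the SHAP library's estimators and not the authors' implementation; the function `shapley_values`, the toy risk model `toy_risk`, and its coefficients are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline).
    Features outside a coalition take their baseline value.
    Cost grows as 2**n, so this is only practical for a few features."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = set(coalition)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(z):
    # Hypothetical risk model with an interaction between features 0 and 1;
    # feature 2 has no effect on the output.
    return 0.1 + 0.3 * z[0] + 0.2 * z[1] + 0.1 * z[0] * z[1]


patient = [1, 1, 1]    # hypothetical binary feature values
reference = [0, 0, 0]  # hypothetical baseline patient
contributions = shapley_values(toy_risk, patient, reference)
```

The contributions sum to the difference between the patient's predicted risk and the baseline risk, which is the property that lets a per-patient explanation be read as an additive breakdown of the prediction.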
