Willem Alexander Children's Hospital, Leiden University Medical Center, Leiden, The Netherlands.
Pediatric Hematology and Immunology Unit, Necker Hospital for Sick Children, Assistance Publique-Hôpitaux de Paris, Paris, France.
Transplant Cell Ther. 2023 Dec;29(12):775.e1-775.e8. doi: 10.1016/j.jtct.2023.09.007. Epub 2023 Sep 13.
Allogeneic hematopoietic stem cell transplantation (HSCT) is a curative treatment for many inborn errors of immunity, metabolism, and hematopoiesis. No predictive models are available for these disorders. We created a machine learning model using XGBoost to predict survival after HSCT using European Society for Blood and Marrow Transplant registry data of 10,888 patients who underwent HSCT for inborn errors between 2006 and 2018, and compared it to a simple linear Cox model, an elastic net Cox model, and a random forest model. The XGBoost model had a cross-validated area under the curve (AUC) of 0.73 at 1 year, which was significantly superior to the other models, and it predicted accurately for countries excluded during training. At 1 year, it predicted close to 0% and >30% mortality more often than the other models while maintaining good calibration. The 5-year survival was 94.7% in the 25% of patients at lowest risk and 62.3% in the 25% at highest risk. Within disease and donor subgroups, XGBoost outperformed the best univariate predictor. We visualized the effect of the main predictors (diagnosis, performance score, patient age, and donor type) using the SHAP ML explainer and developed a stand-alone application that can generate predictions from the model and visualize them. The risk of mortality after HSCT for inborn errors can be accurately predicted using an explainable machine learning model. Its performance exceeds that of models described in the literature. Such predictions can help detect deviations from expected survival and improve risk stratification in trials.
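The two evaluation ideas in the abstract, an AUC at a fixed 1-year horizon and a split of patients into risk quartiles, can be illustrated with a minimal sketch. It is not the authors' code: the function names `auc_at_horizon` and `risk_quartiles` and the eight-patient toy data are hypothetical, and the AUC here simply drops patients censored before the horizon, which may differ from the estimator used in the study.

```python
def auc_at_horizon(risk, time, event, horizon=1.0):
    """AUC for death by `horizon` (years), from predicted risks.

    Cases are patients who died at or before the horizon; controls are
    patients followed beyond it. Patients censored before the horizon are
    dropped, which is a simplification (no censoring weights are applied).
    """
    cases = [r for r, t, e in zip(risk, time, event) if e and t <= horizon]
    controls = [r for r, t, e in zip(risk, time, event) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    # Fraction of case/control pairs in which the case has the higher
    # predicted risk; ties count as half a win.
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def risk_quartiles(risk):
    """Assign each patient a quartile, 0 (lowest risk) to 3 (highest)."""
    order = sorted(range(len(risk)), key=lambda i: risk[i])
    groups = [0] * len(risk)
    for rank, i in enumerate(order):
        groups[i] = rank * 4 // len(risk)
    return groups


# Hypothetical toy data: predicted 1-year mortality risk, follow-up time in
# years, and event flag (1 = died, 0 = censored). Not registry data.
toy_risk = [0.05, 0.10, 0.40, 0.35, 0.02, 0.60, 0.15, 0.30]
toy_time = [5.0, 3.0, 0.5, 2.0, 4.0, 0.3, 0.8, 0.9]
toy_event = [0, 0, 1, 1, 0, 1, 0, 1]

toy_auc = auc_at_horizon(toy_risk, toy_time, toy_event, horizon=1.0)
toy_groups = risk_quartiles(toy_risk)
```

Observed survival within each quartile (for example, a Kaplan-Meier estimate per group) could then be compared across groups, in the spirit of the lowest-risk and highest-risk 25% comparison reported above.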