Department of Computer Science, Universitat Politècnica de Catalunya, Barcelona, 08034, Spain.
Amalfi Analytics, Spain.
Comput Biol Med. 2022 Dec;151(Pt A):106188. doi: 10.1016/j.compbiomed.2022.106188. Epub 2022 Oct 12.
Accurate prediction of mortality after liver transplantation is an important but challenging task. It bears on optimizing organ allocation and estimating the risk of possible dysfunction. Existing risk scoring models, such as the Balance of Risk (BAR) score and the Survival Outcomes Following Liver Transplantation (SOFT) score, do not predict post-transplant mortality with sufficient accuracy. In this study, we evaluate the performance of machine learning models and establish an explainable machine learning model for predicting mortality in liver transplant recipients.
The optimal feature set for mortality prediction was selected by a wrapper method based on binary particle swarm optimization (BPSO). With the selected feature set, seven machine learning models were applied to predict mortality over different time windows. The best-performing model was identified through a comprehensive comparison and evaluation and then used to predict mortality. An interpretable approach based on machine learning and SHapley Additive exPlanations (SHAP) was used to explain the model's decisions explicitly and to uncover new findings.
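To make the wrapper idea concrete, the following is a minimal sketch of BPSO-based feature selection, not the study's implementation. Everything specific in it is an assumption made for illustration: the synthetic data, the nearest-centroid classifier standing in for the seven models, the per-feature penalty of 0.01, and the swarm hyperparameters. Each particle is a bit mask over the features, and its fitness is the hold-out accuracy of a classifier trained on the masked features.

```python
import math
import random

def make_data(rng, n=200, n_informative=3, n_noise=7):
    """Synthetic binary-outcome data: only the first n_informative features carry signal."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(1.5 * label, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X_tr, y_tr, X_te, y_te, mask):
    """Hold-out accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    cents = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X_tr, y_tr) if t == c]
        cents[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    hits = 0
    for x, t in zip(X_te, y_te):
        d = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, cents[c])) for c in (0, 1)}
        hits += (min(d, key=d.get) == t)
    return hits / len(y_te)

def bpso_select(fitness, n_features, rng, n_particles=12, n_iter=30,
                w=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """Binary PSO: real-valued velocities, bit positions sampled through a sigmoid."""
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(n_features):
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-v_max, min(v_max, v))
                # Sigmoid transfer: the velocity sets the probability that the bit is 1.
                prob_one = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob_one else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit

rng = random.Random(0)
X, y = make_data(rng)
X_tr, y_tr, X_te, y_te = X[:140], y[:140], X[140:], y[140:]

def fitness(mask):
    # Wrapper objective: accuracy minus a small per-feature penalty (weight is an assumption).
    return centroid_accuracy(X_tr, y_tr, X_te, y_te, mask) - 0.01 * sum(mask)

best_mask, best_fit = bpso_select(fitness, n_features=10, rng=rng)
print("selected features:", [j for j, bit in enumerate(best_mask) if bit])
print("fitness:", round(best_fit, 3))
```

Because the same hold-out split both guides the search and scores the result, the reported fitness in this sketch is optimistic; a careful pipeline would evaluate the chosen feature set on data the search never saw.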
With regard to predictive power, our results demonstrated that the feature set selected by BPSO outperformed both the feature sets used in the existing risk score models (BAR score, SOFT score) and the feature set processed by principal component analysis (PCA). The best-performing model, extreme gradient boosting (XGBoost), improved the area under the curve (AUC) for mortality prediction by 6.7%, 11.6%, and 17.4% at 3 months, 3 years, and 10 years, respectively, compared to the SOFT score. The main predictors of mortality and their impact are discussed for different age groups and different follow-up periods.
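As an illustration of how such an AUC comparison can be computed, the sketch below evaluates two sets of risk scores against the same outcomes. The outcome labels and both score lists are hypothetical and are not taken from the study. The abstract does not state whether the reported percentages are absolute or relative gains, so the sketch prints both.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive case outranks a random
    negative case; ties count as 0.5."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical outcomes (1 = died within the time window) and risk scores.
died      = [1,   1,   1,   0,   0,   0,    0,   1]
baseline  = [0.9, 0.4, 0.6, 0.5, 0.2, 0.7,  0.1, 0.3]   # stands in for a risk score
candidate = [0.8, 0.7, 0.6, 0.5, 0.2, 0.62, 0.1, 0.65]  # stands in for a learned model

auc_base = auc(died, baseline)
auc_cand = auc(died, candidate)
absolute_gain = auc_cand - auc_base
relative_gain = absolute_gain / auc_base

print(f"baseline AUC:  {auc_base:.4f}")
print(f"candidate AUC: {auc_cand:.4f}")
print(f"absolute gain: {absolute_gain:.4f}")
print(f"relative gain: {relative_gain:.1%}")
```

The pairwise formulation is quadratic in the number of cases, which is fine for a small example; rank-based formulas give the same value more efficiently on large cohorts.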
Our analysis demonstrates that XGBoost can be an ideal method to assess the mortality risk in liver transplantation. In combination with the SHAP approach, the proposed framework provides a more intuitive and comprehensive interpretation of the predictive model, thereby allowing the clinician to better understand the decision-making process of the model and the impact of factors associated with mortality risk in liver transplantation.
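The attribution idea behind SHAP can be sketched with exact Shapley values on a toy model. This is not the `shap` library and not the study's model: the risk function, the feature names, the patient values, and the baseline are all hypothetical. Features outside a coalition are replaced by their baseline value, which is one common convention among several, and the enumeration over all coalitions is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values; features outside a coalition take their baseline value."""
    n = len(x)

    def value(coalition):
        z = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_risk(z):
    # Hypothetical risk function with one interaction term (illustrative only).
    age, lab_score, ischemia_hours = z
    return 0.02 * age + 0.05 * lab_score + 0.001 * age * ischemia_hours

x = [60.0, 25.0, 8.0]          # hypothetical patient
baseline = [50.0, 15.0, 6.0]   # hypothetical reference patient

phi = shapley_values(toy_risk, x, baseline)
for name, contribution in zip(["age", "lab_score", "ischemia_hours"], phi):
    print(f"{name}: {contribution:+.4f}")
print("sum of contributions:", round(sum(phi), 4))
print("prediction gap:      ", round(toy_risk(x) - toy_risk(baseline), 4))
```

The contributions add up to the difference between the patient's prediction and the baseline prediction, which is the additivity property that makes per-feature explanations of this kind readable for a clinician. The purely additive feature (`lab_score`) receives exactly its own term, while the interaction between the other two features is split between them.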