Explainable and visualizable machine learning models to predict biochemical recurrence of prostate cancer.

Affiliations

Collaborative Innovation Centre of Regenerative Medicine and Medical BioResource Development and Application Co-Constructed By the Province and Ministry, Guangxi Medical University, No. 22, Shuangyong Road, Qingxiu District, Nanning City, 530021, Guangxi Zhuang Autonomous Region, People's Republic of China.

Center for Genomic and Personalized Medicine, Guangxi Key Laboratory for Genomic and Personalized Medicine, Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, 530021, Guangxi, People's Republic of China.

Publication information

Clin Transl Oncol. 2024 Sep;26(9):2369-2379. doi: 10.1007/s12094-024-03480-x. Epub 2024 Apr 11.

Abstract

PURPOSE

Machine learning (ML) models have shown excellent performance in prognosis prediction. However, their black-box nature has limited their clinical application. Here, we aimed to establish explainable and visualizable ML models to predict biochemical recurrence (BCR) of prostate cancer (PCa).

MATERIALS AND METHODS

A total of 647 PCa patients were retrospectively evaluated. BCR-related clinical parameters were identified using LASSO regression. The cohort was then split into training and validation datasets at a ratio of 0.75:0.25, and the BCR-related features were entered into Cox regression and five ML algorithms to construct BCR prediction models. The clinical utility of each model was evaluated using concordance index (C-index) values and decision curve analysis (DCA). In addition, Shapley Additive Explanation (SHAP) values were used to explain the contribution of the features in the models.
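The C-index used to compare the models can be illustrated with a minimal sketch of Harrell's concordance index for right-censored data. The abstract does not say which software or C-index variant the authors used, so the function name `harrell_c_index`, the tie handling, and the toy follow-up data below are assumptions made purely for illustration.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when subject i had the event (e.g. BCR)
    strictly before subject j's follow-up time. The pair is concordant
    when subject i also has the higher predicted risk. Ties in predicted
    risk count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy data: follow-up in months, 1 = recurrence observed,
# 0 = censored, and a model's predicted risk score per patient.
toy_times = [5, 8, 12, 20, 30]
toy_events = [1, 0, 1, 1, 0]
toy_risk = [0.9, 0.4, 0.3, 0.6, 0.1]

print(round(harrell_c_index(toy_times, toy_events, toy_risk), 3))  # 0.857
```

In this toy example 6 of the 7 comparable pairs are ordered correctly, giving 6/7. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of recurrence times by predicted risk.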

RESULTS

Using LASSO regression, we identified 11 BCR-related features and then established five ML-based models, namely random survival forest (RSF), survival support vector machine (SSVM), survival tree (sTree), gradient boosting decision tree (GBDT), and extreme gradient boosting (XGBoost), together with a Cox regression model. The C-index values were 0.846 (95% CI 0.796-0.894) for RSF, 0.774 (95% CI 0.712-0.834) for SSVM, 0.757 (95% CI 0.694-0.818) for sTree, 0.820 (95% CI 0.765-0.869) for GBDT, 0.793 (95% CI 0.735-0.852) for XGBoost, and 0.807 (95% CI 0.753-0.858) for Cox regression. DCA showed that the RSF model had significant advantages over all other models. Regarding interpretability, SHAP values demonstrated the tangible contribution of each feature in the RSF model.
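The idea behind SHAP values, attributing a single prediction to individual features, can be sketched by computing exact Shapley values for a tiny model. This is not the authors' pipeline: the toy risk function, the three unnamed features, and the all-zero baseline are hypothetical, and replacing "absent" features with one baseline value is a simplification of what SHAP tooling typically does with a background dataset.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(
    predict: Callable[[List[float]], float],
    instance: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley values of one prediction relative to a baseline input.

    Features outside a coalition are set to their baseline value
    (an illustrative simplification). Cost grows as 2**n, so this is
    only practical for a handful of features.
    """
    n = len(instance)

    def value(coalition: set) -> float:
        x = [instance[k] if k in coalition else baseline[k] for k in range(n)]
        return predict(x)

    phi = [0.0] * n
    for i in range(n):
        others = [k for k in range(n) if k != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(x: List[float]) -> float:
    """Hypothetical risk score with one interaction term (x[0] * x[2])."""
    return 2.0 * x[0] + 1.0 * x[1] + 0.5 * x[0] * x[2]


toy_instance = [1.0, 2.0, 4.0]
toy_baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_risk, toy_instance, toy_baseline)
print([round(p, 6) for p in phi])  # [3.0, 2.0, 1.0]
```

The contributions sum to the difference between the prediction for the instance (6.0) and for the baseline (0.0), and the interaction term is shared equally between the two features involved. This additivity is what allows per-feature contributions to be plotted for a model such as RSF.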

CONCLUSIONS

Our scoring system provides a reference for identifying BCR and for building a framework for personalized therapeutic decision-making in PCa.
