Department of General Surgery, Yan'an People's Hospital, Yan'an, China.
Department of Cardiac Surgery, Fuwai Hospital, Chinese Academy of Medical Sciences, Shenzhen, China.
BMC Gastroenterol. 2024 Apr 19;24(1):137. doi: 10.1186/s12876-024-03223-w.
Predicting lymph node metastasis (LNM) in intrahepatic cholangiocarcinoma (ICC) is critical for choosing a treatment regimen and estimating prognosis. We aimed to develop and validate machine learning (ML)-based predictive models for LNM in patients with ICC.
A total of 345 patients with clinicopathologically confirmed ICC, treated from Jan 2007 to Jan 2019, were enrolled. Predictors of LNM were identified by least absolute shrinkage and selection operator (LASSO) regression and logistic regression analysis. The selected variables were used to develop prediction models for LNM with six ML algorithms: logistic regression (LR), gradient boosting machine (GBM), extreme gradient boosting (XGB), random forest (RF), decision tree (DT), and multilayer perceptron (MLP). We applied 10-fold cross-validation as internal validation and used the average area under the receiver operating characteristic (ROC) curve (AUC) to measure the performance of each model. A feature selection approach was applied to rank the importance of the predictors in each model. A heat map was used to examine the correlations among features. Finally, we established a web calculator using the best-performing model.
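The internal validation step described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, uses synthetic stand-in data from `make_classification` rather than the study cohort, picks hyperparameters arbitrarily, and leaves out XGB to avoid a dependency on the separate `xgboost` package. The names `models`, `mean_auc`, and `best` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (NOT the study cohort): 345 rows, 5 predictors,
# binary outcome playing the role of LNM present / absent.
X, y = make_classification(
    n_samples=345, n_features=5, n_informative=4, n_redundant=1, random_state=0
)

# Five of the six algorithm families named in the abstract (XGB omitted here).
# Hyperparameters are illustrative assumptions, not values from the paper.
models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "GBM": GradientBoostingClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "DT": DecisionTreeClassifier(max_depth=3, random_state=0),
    "MLP": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    ),
}

# 10-fold stratified cross-validation; the score for each fold is the ROC AUC.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

mean_auc = {}
for name, model in models.items():
    fold_aucs = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    mean_auc[name] = float(np.mean(fold_aucs))

# The "best" model is the one with the highest average AUC across folds.
best = max(mean_auc, key=mean_auc.get)

for name, auc in sorted(mean_auc.items(), key=lambda kv: -kv[1]):
    print(f"{name}: mean AUC = {auc:.3f}")
print("best:", best)
```

Wrapping LR and MLP in a pipeline with `StandardScaler` keeps the scaling inside each training fold, so no information from the held-out fold leaks into preprocessing. The AUC values printed here describe the synthetic data only and say nothing about the reported results.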
In multivariate logistic regression analysis, alcoholic liver disease (ALD), smoking, boundary, diameter, and white blood cell (WBC) count were identified as independent predictors of LNM in patients with ICC. In internal validation, the average AUC of the six models ranged from 0.820 to 0.908. The XGB model performed best, with an average AUC of 0.908. Finally, we established a web calculator based on the XGB model that clinicians can use to calculate the likelihood of LNM.
The proposed ML-based predictive models performed well in predicting LNM in patients with ICC, and XGB performed best. A web calculator based on the ML algorithm showed promise in helping clinicians predict LNM and develop individualized treatment plans.