Yao Fan, Miao Jianliang, Quan Bing, Li Jinghuan, Tang Bei, Lu Shenxin, Yin Xin
Liver Cancer Institute, Zhongshan Hospital, Fudan University, Shanghai, People's Republic of China.
National Clinical Research Center for Interventional Medicine, Shanghai, People's Republic of China.
J Hepatocell Carcinoma. 2025 May 31;12:1111-1128. doi: 10.2147/JHC.S523806. eCollection 2025.
To establish prediction models using Shapley Additive exPlanations (SHAP) and multiple machine learning (ML) algorithms to identify clinical features influencing hepatic arterial infusion chemotherapy (HAIC) resistance and survival in patients with hepatocellular carcinoma (HCC).
We recruited 286 patients with unresectable HCC who underwent HAIC. Patients were divided into training and validation datasets at a 7:3 ratio. eXtreme Gradient Boosting (XGBoost) was used to build a preliminary resistance prediction model, and SHAP values were used to quantify the importance of each clinical feature. Recursive Feature Elimination with Cross-Validation (RFECV) was used to select the optimal number of features. Seven ML methods were then used to construct further resistance prediction models, and ten ML algorithms were used to establish survival prognosis models.
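As an illustration of the 7:3 partition described above, the following is a minimal sketch using only the Python standard library. The abstract does not state how the split was performed, so the random shuffle, the fixed seed, the rounding rule, and the function name `split_train_validation` are all assumptions made for this example.

```python
import random


def split_train_validation(patient_ids, train_fraction=0.7, seed=0):
    """Shuffle patient IDs and split them into training and validation lists.

    Assumption: a simple random split; the study may have used a
    different (for example, stratified) procedure.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)  # assumed rounding rule
    return ids[:n_train], ids[n_train:]


# Hypothetical patient IDs 0..285 stand in for the 286 patients.
train_ids, validation_ids = split_train_validation(range(286))
print(len(train_ids), len(validation_ids))  # 200 86
```

Under this rounding rule, 286 patients yield 200 training and 86 validation cases; the actual group sizes in the study are not given in the abstract.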
The areas under the curve (AUCs) of the XGBoost model were 1.000 and 0.812 for the training and validation groups, respectively. SHAP identified 27 of the 38 clinical features as affecting resistance, with pre-HAIC treatment being the main factor. RFECV showed the best model performance with six features: pre-HAIC treatment, tumor size, HBV DNA, alkaline phosphatase (AKP), prothrombin time (PT), and portal vein tumor thrombosis (PVTT). Random Forest performed best among the seven ML algorithms (AUC = 0.935 for training, AUC = 0.876 for validation). The combination of Stepcox [forward] and Gradient Boosting Machine was the best for predicting survival (AUC = 0.98 in training, AUC = 0.83 in validation). Using the above clinical characteristics, patients were categorized into high-risk and low-risk groups by median risk score; these characteristics also performed well in the prognostic model for predicting the survival of patients with HCC.
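To make the two quantities reported above concrete, the sketch below computes a binary AUC and dichotomizes risk scores at their median, using only the Python standard library. It is not the authors' code: the toy labels and scores are invented for illustration, the function names `roc_auc` and `split_by_median` are hypothetical, and the abstract does not say whether a score equal to the median counts as high or low risk (here it is assigned to the low-risk group). The survival AUCs in the study may be time-dependent, which this simple binary version does not cover.

```python
from statistics import median


def roc_auc(labels, scores):
    """Binary AUC: the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def split_by_median(risk_scores):
    """Label each score 'high' if it exceeds the median, otherwise 'low'.

    Assumption: ties with the median go to the low-risk group.
    """
    cutoff = median(risk_scores)
    return ["high" if s > cutoff else "low" for s in risk_scores]


# Invented toy data: 1 = resistant, 0 = not resistant.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]

auc = roc_auc(labels, scores)
groups = split_by_median(scores)
print(round(auc, 3))  # 0.889
print(groups)         # ['low', 'high', 'low', 'high', 'low', 'high']
```

The pairwise formulation is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a rank-based implementation would be preferable for larger datasets.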
Pre-HAIC treatment, tumor size, HBV DNA, AKP, PT, and PVTT are effective predictors of post-HAIC resistance and survival in patients with unresectable advanced HCC.