Liu Yuan, Song Chen, Tian Zhiqiang, Shen Wei
Department of General Surgery, The Affiliated Wuxi People's Hospital of Nanjing Medical University, Wuxi, People's Republic of China.
Int J Gen Med. 2023 May 18;16:1909-1925. doi: 10.2147/IJGM.S408770. eCollection 2023.
This study aims to construct a machine learning model that identifies preoperative, intraoperative, and postoperative high-risk indicators and predicts the onset of venous thromboembolism (VTE) in patients with gastric cancer after radical gastrectomy.
A total of 1239 patients diagnosed with gastric cancer were enrolled in this retrospective study, among whom 107 patients developed VTE after surgery. We collected 42 characteristic variables of gastric cancer patients from the database of Wuxi People's Hospital and Wuxi Second People's Hospital between 2010 and 2020, including patients' demographic characteristics, chronic medical history, laboratory test characteristics, surgical information, and patients' postoperative conditions. Four machine learning algorithms, namely, extreme gradient boosting (XGBoost), random forest (RF), support vector machine (SVM), and k-nearest neighbor (KNN), were employed to develop predictive models. We also utilized Shapley additive explanation (SHAP) for model interpretation and evaluated the models using k-fold cross-validation, receiver operating characteristic (ROC) curves, calibration curves, decision curve analysis (DCA), and external validation metrics.
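The abstract does not state the value of k or how folds were drawn, so the following is only a minimal sketch of stratified k-fold splitting, assuming k = 5. It uses placeholder labels that match the reported cohort counts (107 VTE cases among 1239 patients) rather than any real patient data, and the helper name `stratified_k_fold` is hypothetical. With roughly 8.6% positives, stratifying keeps the VTE proportion similar in every fold.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Return k lists of indices, each with roughly the same class ratio.

    Hypothetical helper for illustration; k=5 is an assumption, since the
    abstract does not report the number of folds.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so each fold gets an equal share of the class.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Placeholder labels matching the reported counts: 107 VTE cases out of 1239.
labels = [1] * 107 + [0] * (1239 - 107)
folds = stratified_k_fold(labels, k=5)

for i, test_idx in enumerate(folds):
    vte_cases = sum(labels[j] for j in test_idx)
    print(f"fold {i}: n={len(test_idx)}, VTE cases={vte_cases}")
```

Each fold serves once as the held-out set while the remaining folds are used for fitting, so every patient contributes to exactly one validation estimate.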
The XGBoost model outperformed the other three prediction models. Its area under the curve (AUC) was 0.989 in the training set and 0.912 in the validation set, indicating strong discrimination. The AUC in the external validation set was 0.85, suggesting that the XGBoost model generalizes well to external data. SHAP analysis showed that higher body mass index (BMI), history of adjuvant radiotherapy and chemotherapy, tumor T-stage, lymph node metastasis, central venous catheter use, greater intraoperative blood loss, and longer operative time were significantly associated with postoperative VTE.
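To make the AUC figures concrete: the AUC equals the probability that a randomly chosen patient who developed VTE receives a higher predicted risk than a randomly chosen patient who did not, with ties counted as one half. The sketch below computes it directly from that definition. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 1 = VTE, 0 = no VTE, scores = predicted risk.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_score = [0.90, 0.20, 0.65, 0.70, 0.10, 0.80, 0.30, 0.65]

auc = roc_auc(y_true, y_score)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; a rank-based formulation gives the same value more efficiently on larger cohorts.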
The XGBoost-based model developed in this study predicts postoperative VTE in patients after radical gastrectomy and may help clinicians make informed clinical decisions.