Hao Ligang, Zhang Junjie, Di Yonghui, Qi Zheng, Zhang Peng
Department of Thoracic Surgery, Xingtai People's Hospital, Xingtai, Hebei, China.
Department of Computed Tomography and Magnetic Resonance, Xingtai People's Hospital, Xingtai, Hebei, China.
PLoS One. 2025 Apr 1;20(4):e0320674. doi: 10.1371/journal.pone.0320674. eCollection 2025.
Non-small-cell lung cancer (NSCLC) and its surgical treatment significantly increase the risk of venous thromboembolism (VTE). This study explored risk factors for VTE and established a machine-learning model to predict failure of postoperative thromboprophylaxis.
This retrospective study included patients with NSCLC who underwent surgery between January 2018 and November 2022. Patients were randomly split 7:3 into training and test sets. Nine machine-learning models were constructed. The three most predictive classifiers were chosen as the first layer of a stacking machine-learning model, and logistic regression served as the second-layer meta-learner.
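The two-layer design described above can be sketched with scikit-learn's `StackingClassifier`. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic (generated only to match the cohort size and approximate event rate), the three base learners are illustrative choices, `GradientBoostingClassifier` stands in for whichever boosting library a study might use, and all hyperparameters and random seeds are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in data (assumption): 362 rows, 6 predictors, ~16% positives.
X, y = make_classification(
    n_samples=362,
    n_features=6,
    n_informative=4,
    n_redundant=0,
    weights=[0.84],
    random_state=0,
)

# 7:3 random split into training and test sets, stratified on the outcome.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# First layer: three base classifiers. Second layer: logistic regression
# fitted on the base learners' cross-validated predictions.
stack = StackingClassifier(
    estimators=[
        ("gbm", GradientBoostingClassifier(random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("gnb", GaussianNB()),
    ],
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X_train, y_train)

# Discrimination on the held-out test set (synthetic data, so the value
# says nothing about the study's results).
test_auc = roc_auc_score(y_test, stack.predict_proba(X_test)[:, 1])
print(f"test ROC-AUC on synthetic data: {test_auc:.3f}")
```

Passing `cv=5` means the meta-learner is trained on out-of-fold predictions from the base classifiers, which limits the leakage that would occur if the second layer saw predictions made on data the first layer had already been fitted to.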
The study included 362 patients, of whom 58 (16.0%) developed VTE. The variables retained in the multivariable logistic regression analysis (age, platelet count, D-dimer, albumin, smoking history, and epidermal growth factor receptor (EGFR) exon 21 mutation) were used to develop the nine machine-learning models. The LightGBM classifier, the random forest classifier, and Gaussian naive Bayes (GNB) were chosen for the first layer of the stacking machine-learning model. The area under the receiver operating characteristic curve (ROC-AUC), accuracy, sensitivity, and specificity of the stacking model in the training/test sets were 0.984/0.979, 0.949/0.954, 0.935/1.000, and 0.958/0.887, respectively. In the validation set, the final stacking model achieved an ROC-AUC of 0.983, accuracy of 0.937, sensitivity of 0.978, and specificity of 0.947. Decision curve analysis indicated a high net benefit.
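For readers less familiar with the reported metrics, the sketch below shows how accuracy, sensitivity, and specificity follow from confusion-matrix counts. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the study; only the 58-of-362 event rate comes from the abstract.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # all correct calls / all patients
        "sensitivity": tp / (tp + fn),      # VTE cases correctly flagged
        "specificity": tn / (tn + fp),      # non-VTE cases correctly cleared
    }


# Made-up counts for illustration only (not study data).
example = classification_metrics(tp=9, fn=1, tn=85, fp=5)
print(example)

# Event rate reported in the abstract: 58 VTE cases among 362 patients.
vte_rate_percent = round(58 / 362 * 100, 1)
print(f"VTE rate: {vte_rate_percent}%")
```

With an event rate near 16%, accuracy alone can look high even for a model that misses many VTE cases, which is why sensitivity and specificity are reported alongside it.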
The stacking machine-learning model based on EGFR mutation status and clinical characteristics showed predictive value for postoperative VTE in patients with NSCLC.