Construction of a risk prediction model for postoperative deep vein thrombosis in colorectal cancer patients based on machine learning algorithms.

Authors

Liu Xin, Shu Xingming, Zhou Yejiang, Jiang Yifan

Affiliations

Department of Clinical Medicine, Southwest Medical University, Luzhou, China.

Department of Gastrointestinal Surgery, The Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan, China.

Publication information

Front Oncol. 2024 Nov 27;14:1499794. doi: 10.3389/fonc.2024.1499794. eCollection 2024.

Abstract

BACKGROUND

Colorectal cancer is a prevalent malignancy of the digestive system, with an increasing incidence. Lower extremity deep vein thrombosis (DVT) is a frequent postoperative complication, occurring in up to 40% of cases.

OBJECTIVE

This study aims to develop and validate a machine learning (ML) model to predict the risk of lower extremity DVT in patients with colorectal cancer, supporting preventive and therapeutic measures that improve recovery and patient safety.

METHODS

In this retrospective cohort study, we collected data on 429 colorectal cancer patients from January 2021 to January 2024. The medical records included age, blood test results, body mass index, underlying diseases, clinical staging, histological typing, surgical methods, and postoperative complications. We used the Synthetic Minority Oversampling Technique (SMOTE) to address class imbalance and split the dataset into training and validation sets in a 7:3 ratio. Feature selection was performed using Random Forest (RF), XGBoost, and the Least Absolute Shrinkage and Selection Operator (LASSO). We then trained six machine learning models: Logistic Regression (LR), Naive Bayes (NB), Gaussian Process (GP), Random Forest, XGBoost, and Multilayer Perceptron (MLP). Model performance was evaluated using the area under the Receiver Operating Characteristic curve (AUC), accuracy, sensitivity, specificity, F1 score, and the confusion matrix. Additionally, SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME) were used to improve the interpretability of the results.
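The oversampling and splitting steps described above can be illustrated with a minimal standard-library sketch. It is not the authors' pipeline: the toy data, the function names `smote` and `split_70_30`, the neighbour count `k`, and the random seeds are all assumptions made for illustration. The core idea of SMOTE shown here is that each synthetic minority sample is placed at a random point on the line segment between a real minority sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` within the minority class, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


def split_70_30(rows, seed=0):
    """Shuffle the rows and split them 7:3 into training and validation sets."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy data: 20 majority-class points and 4 minority-class points.
majority = [(float(i), float(i % 7)) for i in range(20)]
minority = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.5), (2.5, 2.0)]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=3)
balanced = [(x, 0) for x in majority] + [(x, 1) for x in minority + new_points]

train, valid = split_70_30(balanced)
print(len(train), len(valid))
```

Because every synthetic point is a convex combination of two real minority samples, it always falls inside the range already spanned by the minority class. In practice a library implementation such as the one in imbalanced-learn would normally be used instead of hand-written code.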

RESULTS

The study combined Random Forest, XGBoost, and LASSO regression with univariate regression analysis to identify significant predictors: age, preoperative prealbumin, preoperative albumin, preoperative hemoglobin, operation time, PIKVA2, carcinoembryonic antigen (CEA), and preoperative neutrophil count. The XGBoost model outperformed the other ML algorithms, achieving an AUC of 0.996, an accuracy of 0.9636, a specificity of 0.9778, and an F1 score of 0.9576. SHAP identified age and preoperative prealbumin as the main determinants of the model's predictions. Finally, LIME was used to explain individual predictions.
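For readers less familiar with the reported metrics, the sketch below shows how accuracy, sensitivity, specificity, and F1 score follow from the four cells of a confusion matrix, and how AUC can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The counts and scores are hypothetical and are not the study's data; the function names are likewise assumptions.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive the standard metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # recall: share of true DVT cases detected
    specificity = tn / (tn + fp)   # share of non-DVT cases correctly ruled out
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a correct pair."""
    wins = sum(
        (p > n) + 0.5 * (p == n)
        for p in scores_pos
        for n in scores_neg
    )
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical confusion-matrix counts for a validation set of 100 patients.
metrics = classification_metrics(tp=40, fp=5, tn=45, fn=10)

# Hypothetical predicted risk scores for positive and negative cases.
area = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])

print(metrics, area)
```

Reporting specificity and F1 alongside accuracy matters for an imbalanced outcome such as DVT, since accuracy alone can look high even when few true cases are detected.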

CONCLUSION

The machine learning algorithms effectively predicted postoperative lower limb deep vein thrombosis in colorectal cancer patients. The XGBoost model demonstrated strong potential for improving early detection and treatment in clinical settings.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/29e8/11631706/db4dbd19ad70/fonc-14-1499794-g001.jpg