Li Yonggang, Hui Shuan
Department of Surgery of Glandular Vascular Abdominal Wall, The First People's Hospital of Xianyang, Xianyang 712000, Shaanxi, China.
Am J Transl Res. 2025 Apr 15;17(4):2614-2628. doi: 10.62347/CZYA6232. eCollection 2025.
To develop and validate a predictive tool using machine learning models for identifying risk factors for upper limb dysfunction following modified radical mastectomy (MRM) in breast cancer patients.
A total of 768 breast cancer patients who underwent MRM between January 2022 and December 2023 were included in this study. The dataset was divided into a training set (506 cases) and a validation set (262 cases). The collected data encompassed demographic characteristics, clinicopathological features, medical history, and postoperative rehabilitation plans. Predictive analyses were conducted using five machine learning models: support vector machine (SVM), extreme gradient boosting (XGBOOST), Gaussian naïve Bayes (GNB), adaptive boosting (ADABOOST), and random forest. Models were evaluated using ten-fold cross-validation, with performance metrics including receiver operating characteristic (ROC) curves, area under the curve (AUC) values, specificity, sensitivity, accuracy, and F1-score. DeLong's test was used to compare AUC values and identify the optimal predictive model.
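The evaluation procedure described above can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic (generated with `make_classification`), the seven features and all hyperparameters are assumptions, and scikit-learn's `GradientBoostingClassifier` is used as a stand-in for XGBOOST so that the example needs no extra dependency. Only the sample sizes (768 total, 506 training, 262 validation), the ten-fold cross-validation, and the list of metrics follow the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.metrics import make_scorer, recall_score
from sklearn.model_selection import (StratifiedKFold, cross_validate,
                                     train_test_split)
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the patient data (assumption: 7 numeric predictors).
X, y = make_classification(n_samples=768, n_features=7, n_informative=5,
                           n_redundant=0, random_state=0)

# 506 training cases and 262 validation cases, as in the study design.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=262, stratify=y, random_state=0)

models = {
    "SVM": make_pipeline(StandardScaler(), SVC()),
    # Stand-in for XGBOOST (assumption, see the note above).
    "GradientBoosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
    "GNB": GaussianNB(),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
    "RandomForest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Sensitivity is recall on the positive class; specificity is recall on class 0.
scoring = {
    "auc": "roc_auc",
    "accuracy": "accuracy",
    "f1": "f1",
    "sensitivity": "recall",
    "specificity": make_scorer(recall_score, pos_label=0),
}

# Ten-fold stratified cross-validation on the training set only.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
results = {}
for name, model in models.items():
    scores = cross_validate(model, X_train, y_train, cv=cv, scoring=scoring)
    results[name] = {m: scores[f"test_{m}"].mean() for m in scoring}

for name, metrics in results.items():
    print(name, {m: round(v, 3) for m, v in metrics.items()})
```

The validation set is held out of the cross-validation loop here, so it remains available for a final check of whichever model is selected. DeLong's test is not part of scikit-learn and is omitted from this sketch.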
Baseline characteristics showed no significant differences between the training and validation sets (P>0.05). Analysis of factors associated with upper limb dysfunction in the training set revealed significant differences in age, BMI, cancer type, axillary lymph node dissection, ipsilateral radiotherapy, postoperative rehabilitation plans, and monthly per capita household income (P<0.05). Pairwise correlations among these variables were low (R values close to 0), indicating minimal multicollinearity. The XGBOOST and random forest models demonstrated high AUC values (0.817-0.884) across both the training and validation sets. These models also exhibited superior specificity and sensitivity, indicating strong predictive performance and robustness in identifying patients at risk of postoperative upper limb dysfunction.
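A multicollinearity check of the kind mentioned above can be sketched as a pairwise correlation matrix. This is an illustration under stated assumptions, not the authors' analysis: the seven predictors are simulated as independent random columns, and the variable names in `feature_names` are hypothetical labels taken from the list in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical labels for the seven predictors named in the text.
feature_names = ["age", "bmi", "cancer_type", "alnd", "radiotherapy",
                 "rehab_plan", "income"]

# Simulated, mutually independent predictors for 506 training cases (assumption).
X = rng.normal(size=(506, len(feature_names)))

# Pearson correlation matrix between columns (rowvar=False: columns are variables).
corr = np.corrcoef(X, rowvar=False)

# Largest absolute correlation between two different predictors.
off_diag = corr[~np.eye(len(feature_names), dtype=bool)]
max_abs_r = np.abs(off_diag).max()

print(f"largest |R| between predictors: {max_abs_r:.3f}")
```

Off-diagonal values near 0 suggest that the predictors carry largely separate information. Pairwise correlation does not detect every form of multicollinearity, so a variance inflation factor is often computed alongside it.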
The XGBOOST and random forest models exhibited excellent predictive accuracy, offering valuable tools for the early identification and personalized management of high-risk patients. These models provide critical data support for postoperative rehabilitation planning and contribute to improving the quality of life for breast cancer patients.