Peng Xixi, Lu Ziyue
Department of Breast Surgery, Hubei Cancer Hospital, Tongji Medical College, Huazhong University of Science and Technology, Hubei Provincial Clinical Research Center for Breast Cancer, Wuhan Clinical Research Center for Breast Cancer, Wuhan, Hubei, 430079, People's Republic of China.
Department of Thoracic and Bone-soft Tissue Surgery, Hubei Cancer Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, 430079, People's Republic of China.
Int J Gen Med. 2024 Sep 2;17:3799-3812. doi: 10.2147/IJGM.S478573. eCollection 2024.
Upper limb lymphedema is one of the most common surgery-related adverse events, owing to the large gap between guideline implementation and intended clinical outcomes. However, monitoring limb lymphedema remains challenging because its clinical presentation is vague. This study aimed to develop and validate practical machine learning models for predicting upper limb lymphedema.
We retrospectively collected single-center electronic health record data from patients who underwent breast cancer surgery between June 2021 and June 2023 to develop models for early risk prediction of upper limb lymphedema. The data were randomly split into a training set (70%) and a testing set (30%). We then developed prediction models using three machine learning algorithms: a random forest model (RFM), a generalized logistic regression model (GLRM), and an artificial neural network model (ANNM). Model performance was compared using the area under the receiver operating characteristic curve (AUROC) and calibration curves. The potential clinical usefulness of the best model at its best threshold was assessed with a net benefit approach using decision curve analysis (DCA).
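The evaluation steps described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' code: the function names `split_70_30`, `auroc`, and `net_benefit`, the random seed, and the toy labels and scores are all assumptions made for this example. Fitting the three models themselves would need a machine learning library and is not shown.

```python
import random

def split_70_30(records, seed=0):
    """Randomly split records into a 70% training and 30% testing set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary choice
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def auroc(y_true, y_score):
    """AUROC as the probability that a random positive case scores
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def net_benefit(y_true, y_score, threshold):
    """Net benefit used in decision curve analysis:
    TP/n - FP/n * pt / (1 - pt), where pt is the threshold probability."""
    n = len(y_true)
    pairs = list(zip(y_true, y_score))
    tp = sum(1 for y, s in pairs if s >= threshold and y == 1)
    fp = sum(1 for y, s in pairs if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

# Toy data (hypothetical): 1 = lymphedema, 0 = no lymphedema.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

train, test = split_70_30(range(3160))
print(len(train), len(test))
print(auroc(y_true, y_score))
print(net_benefit(y_true, y_score, threshold=0.3))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the model against the "treat all" and "treat none" strategies. The set sizes printed here follow only from applying a 70/30 split to 3160 records; the abstract does not report the actual sizes.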
Of the 3201 patients screened for eligibility, 3160 were included in the prediction model. Body mass index (BMI), hypertension, TNM, lesion site, level of lymph node dissection (LNMD), treatment, and nurse were independent risk factors for upper limb lymphedema and were used as candidate variables for the machine learning-based prediction models. The RFM algorithm, combined with the seven candidate variables, showed the highest predictive performance in both the training and internal validation sets, with areas under the curve (AUCs) of 0.894 (95% confidence interval [CI], 0.839-0.949) and 0.889 (95% CI, 0.834-0.944), respectively. The other two prediction models had AUCs of 0.731 and 0.819, with 95% CIs of 0.674-0.789 and 0.762-0.876, respectively.
The interpretable predictive model helps physicians predict the risk of upper limb lymphedema more accurately in patients undergoing breast cancer surgery. In particular, the newly established RFM showed good ability to identify patients at high risk of upper limb lymphedema, which could facilitate future clinical decisions and hospital management and improve outcomes.
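As a reading aid for the intervals above, the short sketch below checks how each reported AUC relates to its 95% CI. The abstract does not say how the intervals were computed; the sketch assumes a symmetric normal-approximation interval (AUC ± 1.96 × SE), and the helper names `midpoint` and `implied_se` are made up for this example.

```python
# Reported AUCs and 95% CIs, copied from the results above.
# The labels for the two non-RFM models are placeholders: the abstract
# does not say which values belong to GLRM and which to ANNM.
REPORTED = {
    "RFM, training set": (0.894, 0.839, 0.949),
    "RFM, internal validation set": (0.889, 0.834, 0.944),
    "other model (first reported)": (0.731, 0.674, 0.789),
    "other model (second reported)": (0.819, 0.762, 0.876),
}

def midpoint(lower, upper):
    """Centre of a confidence interval."""
    return (lower + upper) / 2

def implied_se(lower, upper, z=1.96):
    """Standard error implied by a symmetric interval of the form
    estimate +/- z * SE (an assumption, not stated in the abstract)."""
    return (upper - lower) / (2 * z)

for label, (auc, lower, upper) in REPORTED.items():
    print(
        f"{label}: AUC={auc:.3f}, "
        f"CI midpoint={midpoint(lower, upper):.4f}, "
        f"implied SE={implied_se(lower, upper):.4f}"
    )
```

Each reported AUC lies at the midpoint of its interval, or within rounding of it, which is consistent with a symmetric interval. Under that assumption, the half-widths of roughly 0.055 to 0.058 correspond to standard errors of about 0.03.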