Zou Xin Chang, Rao Xue Peng, Huang Jian Biao, Zhou Jie, Chao Hai Chao, Zeng Tao
The Second Affiliated Hospital, Jiangxi Medical College, Nanchang University, Nanchang, China.
Department of Urology, Second Affiliated Hospital of Nanchang University, Nanchang, China.
Front Oncol. 2024 Dec 13;14:1477166. doi: 10.3389/fonc.2024.1477166. eCollection 2024.
Distant metastasis in bladder cancer is linked to poor prognosis and significant mortality. Machine learning (ML), a key area of artificial intelligence, has shown promise in the diagnosis, staging, and treatment of bladder cancer. This study aimed to employ various ML techniques to predict distant metastasis in patients with bladder cancer.
Patients diagnosed with bladder cancer in the Surveillance, Epidemiology, and End Results (SEER) database from 2000 to 2021 were included in this study. After a rigorous screening process, 4,108 patients were selected for further analysis and divided in a 7:3 ratio into a training cohort and an internal validation cohort. In addition, 118 patients treated at the Second Affiliated Hospital of Nanchang University were included as an external validation cohort. Features were selected using the least absolute shrinkage and selection operator (LASSO) regression algorithm. Based on the significant features identified, three ML algorithms were used to develop prediction models: logistic regression, support vector machine (SVM), and linear discriminant analysis (LDA). The predictive performance of the three models was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), precision, accuracy, and F1 score.
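The workflow described above (7:3 split, LASSO-based feature selection, three classifiers, four metrics) can be sketched as follows. This is a minimal illustration and not the authors' code: the abstract does not name the software used, so scikit-learn is an assumption, the data are synthetic stand-ins for the SEER cohort, an L1-penalized logistic regression stands in for the LASSO step, and all hyperparameters (`C=0.1`, the RBF kernel, the random seed) are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the cohort: 4,108 rows, roughly 12% positive cases.
# The 12 features are placeholders, not the SEER variables.
X, y = make_classification(
    n_samples=4108, n_features=12, n_informative=5,
    weights=[0.88], random_state=0,
)

# 7:3 split into training and internal validation cohorts, stratified on outcome.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

def l1_selector():
    # L1-penalized logistic regression used as the LASSO-style feature filter
    # (assumption: the abstract does not specify the exact LASSO variant).
    return SelectFromModel(
        LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
    )

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(kernel="rbf"),
    "LDA": LinearDiscriminantAnalysis(),
}

results = {}
for name, clf in models.items():
    # Scaling and feature selection are fitted on the training cohort only.
    pipe = make_pipeline(StandardScaler(), l1_selector(), clf)
    pipe.fit(X_train, y_train)
    pred = pipe.predict(X_val)
    score = pipe.decision_function(X_val)  # continuous score for the ROC curve
    results[name] = {
        "auc": roc_auc_score(y_val, score),
        "accuracy": accuracy_score(y_val, pred),
        "precision": precision_score(y_val, pred, zero_division=0),
        "f1": f1_score(y_val, pred, zero_division=0),
    }

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Placing the selector inside the pipeline keeps the validation cohort out of the feature-selection step, which is one common way to avoid optimistic estimates. An external cohort would be scored by calling the fitted pipeline on the external data without refitting.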
Overall, 12.0% of the study population (n = 495) had distant metastasis. LASSO regression analysis revealed that age, chemotherapy, tumor size, the examination of non-regional lymph nodes, and regional lymph node evaluation were significantly associated with distant metastasis of bladder cancer. In the internal validation cohort, the accuracies of logistic regression, SVM, and LDA were 0.874, 0.877, and 0.845, respectively. The precisions were 0.805, 0.769, and 0.827, respectively, and the F1 scores were 0.821, 0.819, and 0.835, respectively. The ROC curves showed that the AUC of every model was greater than 0.7. In the external validation cohort, the accuracies of logistic regression, SVM, and LDA were 0.856, 0.848, and 0.797, respectively, and the AUC of every model again exceeded 0.7. The precisions were 0.877, 0.718, and 0.736, respectively, and the F1 scores were 0.797, 0.778, and 0.762, respectively. Among the algorithms used, logistic regression showed better overall predictive performance than the other two methods. The three variables with the highest importance scores in the logistic regression model were non-regional lymph nodes, age, and chemotherapy.
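For readers relating the reported metrics to one another, the sketch below shows how accuracy, precision, recall, and F1 follow from a binary confusion matrix. The counts are invented for illustration and are not taken from the study. The abstract does not state whether precision and F1 were computed for the metastasis class or averaged across classes; the sketch assumes the positive (metastasis) class.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical validation cohort of 1,000 patients, 120 of them (12.0%)
# with distant metastasis. These counts are illustrative only.
example = classification_metrics(tp=60, fp=30, fn=60, tn=850)

# Reference point: a rule that always predicts "no metastasis" on the same cohort.
baseline = classification_metrics(tp=0, fp=0, fn=120, tn=880)

print("example :", {k: round(v, 3) for k, v in example.items()})
print("baseline:", {k: round(v, 3) for k, v in baseline.items()})
```

With a positive rate near 12%, the always-negative rule already reaches an accuracy of 0.88 while its precision and F1 are 0, which is why accuracy is usually read alongside AUC, precision, and F1 on imbalanced outcomes.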
The prediction models developed using three ML algorithms demonstrated strong accuracy and discriminative capability in predicting distant metastasis in patients with bladder cancer. These models may help clinicians understand patient prognosis and formulate personalized treatment strategies, ultimately improving the overall prognosis of patients with bladder cancer.