Li Dong-Lin, Zhang Lin, Yan Hao-Ji, Zheng Yin-Bin, Guo Xiao-Guang, Tang Sheng-Jie, Hu Hai-Yang, Yan Hang, Qin Chao, Zhang Jun, Guo Hai-Yang, Zhou Hai-Ning, Tian Dong
Department of Thoracic Surgery, Suining Central Hospital, Suining, China.
Department of Thoracic Surgery, Affiliated Hospital of North Sichuan Medical College, Nanchong, China.
Front Oncol. 2022 Sep 8;12:986358. doi: 10.3389/fonc.2022.986358. eCollection 2022.
For patients with stage T1-T2 esophageal squamous cell carcinoma (ESCC), accurately predicting lymph node metastasis (LNM) remains challenging. We aimed to investigate the performance of machine learning (ML) models for predicting LNM in patients with stage T1-T2 ESCC.
Patients with T1-T2 ESCC treated at three centers between January 2014 and December 2019 were included in this retrospective study and divided into training and external test sets. All patients underwent esophagectomy, and LNM status was determined by pathological examination. Thirty-six ML models were developed by combining six modeling algorithms with six feature selection techniques. The optimal model was selected using the bootstrap method, and an external test set was used to further assess its generalizability and effectiveness. Prediction performance was evaluated using the area under the receiver operating characteristic curve (AUC).
Of the 1097 included patients, 294 (26.8%) had LNM. The ML models based on clinical features showed good predictive performance for LNM status, with a median bootstrapped AUC of 0.659 (range: 0.592, 0.715). The optimal model using the naive Bayes algorithm with feature selection by determination coefficient had the highest AUC of 0.715 (95% CI: 0.671, 0.763). In the external test set, the optimal ML model achieved an AUC of 0.752 (95% CI: 0.674, 0.829), which was superior to that of T stage (0.624, 95% CI: 0.547, 0.701).
ML models provide good LNM prediction value for stage T1-T2 ESCC patients, and the naive Bayes algorithm with feature selection by determination coefficient performed best.
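As an illustration of the evaluation metric reported above, the following is a minimal sketch of how an AUC and a bootstrap percentile confidence interval can be computed. It is not the authors' code: the function names `auc` and `bootstrap_auc_ci`, the number of resamples, the percentile method, and the toy labels and scores are all assumptions made for this example, and the abstract does not state which bootstrap variant was used.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as 0.5). Labels are assumed to be 0 or 1."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (hypothetical helper).
    Cases are resampled with replacement; resamples that contain
    only one class are skipped because the AUC is undefined there."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_i = max(0, int(round((alpha / 2) * n_boot)))
    hi_i = min(n_boot - 1, int(round((1 - alpha / 2) * n_boot)) - 1)
    return stats[lo_i], stats[hi_i]


# Toy data for illustration only (not study data):
# 1 = LNM present, 0 = LNM absent; scores are predicted probabilities.
toy_labels = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.30, 0.55, 0.60, 0.15]

point = auc(toy_labels, toy_scores)
low, high = bootstrap_auc_ci(toy_labels, toy_scores, n_boot=500, seed=42)
print(f"AUC = {point:.3f} (95% CI: {low:.3f}, {high:.3f})")
```

With a sample as small as the toy data the interval is very wide; the narrower intervals in the abstract reflect the much larger cohort of 1097 patients.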