Tian HuaKai, Ning ZhiKun, Zong Zhen, Liu Jiang, Hu CeGui, Ying HouQun, Li Hui
Department of General Surgery, First Affiliated Hospital of Nanchang University, Nanchang, China.
Department of Gastrointestinal Surgery, Second Affiliated Hospital of Nanchang University, Nanchang, China.
Front Med (Lausanne). 2022 Jan 18;8:759013. doi: 10.3389/fmed.2021.759013. eCollection 2021.
This study aimed to use machine learning (ML) to identify the best-performing model for predicting lymph node metastasis (LNM) in early gastric cancer, in order to better guide clinical diagnosis and treatment decisions.
We screened patients with stage T1a and T1b gastric cancer diagnosed from 2010 to 2015 in the Surveillance, Epidemiology and End Results (SEER) database, and collected the clinicopathological data of patients with early gastric cancer who were treated with surgery at the Second Affiliated Hospital of Nanchang University from January 2014 to December 2016. We applied seven ML algorithms to the patients' pathological information to develop the best prediction model for early gastric cancer LNM: the generalized linear model (GLM), recursive partitioning and regression trees (RPART), random forest (RF), gradient boosting machine (GBM), support vector machine (SVM), regularized dual averaging (RDA), and the neural network (NNET). Of the SEER cohort, 80% of patients were randomly selected to train the models, and the remaining 20% were used for testing. The data from the Second Affiliated Hospital served as the external validation set. Finally, we used the area under the receiver operating characteristic curve (AUROC), F1-score, sensitivity, and specificity to evaluate model performance.
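As an illustration of the evaluation protocol described above, the following is a minimal sketch of a random 80/20 split and the four reported metrics, written in plain Python. It is not the authors' code: the abstract does not state which software was used, and every function name, the `seed` parameter, the 0.5 decision threshold, and the toy labels and scores are assumptions made for this example only.

```python
import random


def split_train_test(records, train_fraction=0.8, seed=0):
    """Randomly partition records into training and test subsets (hypothetical helper)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 = LNM present, 0 = absent."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(y_true, y_pred):
    """Fraction of true LNM cases that the model flags: tp / (tp + fn)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true, y_pred):
    """Fraction of LNM-free cases that the model clears: tn / (tn + fp)."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall: 2*tp / (2*tp + fp + fn)."""
    tp, fp, _, fn = confusion_counts(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn)


def auroc(y_true, scores):
    """Probability that a random positive case scores above a random negative case.

    Ties count as one half. This pairwise form is equivalent to the area
    under the ROC curve and is adequate for small illustrative data sets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, invented for illustration; not taken from SEER or the hospital cohort.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

    train, test = split_train_test(range(10))
    print("train size:", len(train), "test size:", len(test))
    print("sensitivity:", sensitivity(y_true, y_pred))
    print("specificity:", specificity(y_true, y_pred))
    print("F1-score:", f1_score(y_true, y_pred))
    print("AUROC:", auroc(y_true, scores))
```

Sensitivity, specificity, and F1-score depend on the chosen decision threshold, whereas AUROC summarizes ranking quality across all thresholds, which is one reason to report them together.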
Tumour size, tumour grade, and depth of tumour invasion were independent risk factors for early gastric cancer LNM. A comprehensive comparison of model performance on the training and test sets showed that the RDA model had the best predictive performance (F1-score = 0.773; AUROC = 0.742). On the external validation set, the AUROC was 0.73.
Tumour size, tumour grade, and depth of tumour invasion were independent risk factors for early gastric cancer LNM. ML predicted LNM risk more accurately, and the RDA model had the best predictive performance and could better guide clinical diagnosis and treatment decisions.