Prediction of bone metastasis in non-small cell lung cancer based on machine learning.
Author information
Li Meng-Pan, Liu Wen-Cai, Sun Bo-Lin, Zhong Nan-Shan, Liu Zhi-Li, Huang Shan-Hu, Zhang Zhi-Hong, Liu Jia-Ming
Affiliations
Department of Orthopedic Surgery, The First Affiliated Hospital of Nanchang University, Nanchang, China.
The First Clinical Medical College of Nanchang University, Nanchang, China.
Publication information
Front Oncol. 2023 Jan 9;12:1054300. doi: 10.3389/fonc.2022.1054300. eCollection 2022.
OBJECTIVE
This study aimed to develop a machine learning model that performs well in predicting bone metastasis (BM) in non-small cell lung cancer (NSCLC) and to build a simple web predictor based on that model.
METHODS
Patients diagnosed with NSCLC between 2010 and 2018 in the Surveillance, Epidemiology and End Results (SEER) database were included. To broaden the applicability of the study, data from patients first diagnosed with NSCLC at the First Affiliated Hospital of Nanchang University between January 2007 and December 2016 were also included. Independent risk factors for BM in NSCLC were screened by univariate and multivariate logistic regression. On this basis, six commonly used machine learning algorithms were chosen to build predictive models: Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), Gradient Boosting Machine (GBM), Naive Bayes Classifier (NBC) and eXtreme Gradient Boosting (XGB). The area under the receiver operating characteristic curve (AUC), accuracy, sensitivity and specificity were used to evaluate the performance of these models, and the best model was then used to build the web predictor for BM in NSCLC patients.
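The four evaluation metrics named above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 0.5 decision threshold and the toy labels and scores are all assumptions made for illustration, and it uses only the Python standard library. AUC is computed here as the probability that a randomly chosen BM case receives a higher predicted score than a randomly chosen non-BM case, with ties counted as one half.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")
    # Each pair scores 1 if the positive outranks the negative, 0.5 on a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity and AUC for binary labels (1 = BM)."""
    # The threshold is a hypothetical choice; the abstract does not state one.
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "auc": roc_auc(y_true, y_score),
    }


# Toy example (made-up values, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_score = [0.9, 0.7, 0.3, 0.6, 0.4, 0.2, 0.2, 0.1]
metrics = classification_metrics(y_true, y_score)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy, sensitivity and specificity depend on the chosen threshold, whereas AUC does not, which is one reason AUC is commonly used to compare models on data where the outcome is uncommon.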
RESULTS
A total of 50,581 NSCLC patients were included in this study, and 5,087 (10.06%) of them developed BM. Sex, grade, laterality, histology, T stage, N stage, and chemotherapy were independent risk factors for BM in NSCLC. Of the six models, the one built with the XGB algorithm performed best in both internal and external validation, with AUC scores of 0.808 and 0.841, respectively. The XGB algorithm was therefore used to build a web predictor of BM in NSCLC.
CONCLUSION
This study developed an XGB-based web predictor for estimating the risk of BM in NSCLC patients, which may assist doctors in clinical decision-making.