Thyroid Surgery Department, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China.
Department of Hepatobiliary and Pancreatic Surgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China.
Endocrine. 2024 Jun;84(3):1040-1050. doi: 10.1007/s12020-023-03657-4. Epub 2023 Dec 29.
Distant metastasis of thyroid cancer often indicates poor prognosis, and it is important to identify patients who have developed distant metastasis or are at high risk as early as possible. This paper aimed to predict distant metastasis of thyroid cancer through the construction of machine learning models to provide a reference for clinical diagnosis and treatment.
MATERIALS & METHODS: Data on demographic and clinicopathological characteristics of thyroid cancer patients between 2010 and 2015 were extracted from the National Institutes of Health (NIH) Surveillance, Epidemiology, and End Results (SEER) database. Univariate and multivariate logistic regression models were used to screen for independent risk factors. Seven machine learning models, namely Decision Tree (DT), ElasticNet (ENET), Logistic Regression (LR), Extreme Gradient Boosting (XGBoost), Random Forest (RF), Multilayer Perceptron (MLP), and Radial Basis Function Support Vector Machine (RBFSVM), were compared and evaluated using the following metrics: area under the receiver operating characteristic curve (AUC), calibration curve, decision curve analysis (DCA), sensitivity (also called recall), specificity, precision, accuracy, and F1 score. Interpretable machine learning was used to identify possible correlations between variables and distant metastasis.
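For readers less familiar with the evaluation metrics listed above, the following is a minimal sketch of how sensitivity, specificity, precision, accuracy, F1 score, and AUC can be computed for a binary classifier. It is not the authors' code: the function names `binary_metrics` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and no SEER data are involved.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Threshold-based metrics from 0/1 labels and 0/1 predictions (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # also called recall
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    accuracy = ratio(tp + tn, len(pairs))
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a
    random negative case, with ties counted as 0.5 (hypothetical helper)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 1 = distant metastasis, 0 = none.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc_value = roc_auc(y_true, scores)
```

Sensitivity, specificity, precision, accuracy, and F1 score depend on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason such studies usually report both kinds of metric.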
RESULTS: Multivariate regression analysis identified the following independent risk factors for distant metastasis: age, gender, race, marital status, histological type, capsular invasion, and number of lymph node metastases. Of the seven machine learning algorithms, RF performed best, with an AUC of 0.948, sensitivity of 0.919, accuracy of 0.845, and F1 score of 0.886 in the training set, and an AUC of 0.960, sensitivity of 0.929, accuracy of 0.906, and F1 score of 0.908 in the test set.
CONCLUSION: The machine learning model constructed in this study can help in the early diagnosis of distant metastasis of thyroid cancer and can help physicians make better decisions and plan medical interventions.