Guo Zhen-Tian, Tian Kun, Xie Xi-Yuan, Zhang Yu-Hang, Fang De-Bao
Department of General Surgery, Beijing Electric Power Hospital, State Grid Corporation China, Capital Medical University, Beijing 100073, China.
Fujian Provincial Hospital, Fuzhou, Fujian 350001, China.
Int J Endocrinol. 2023 Dec 30;2023:9965578. doi: 10.1155/2023/9965578. eCollection 2023.
We aimed to establish an effective machine learning (ML) model for predicting the risk of distant metastasis (DM) in medullary thyroid carcinoma (MTC).
Demographic data of MTC patients were extracted from the Surveillance, Epidemiology, and End Results (SEER) database of the National Institutes of Health between 2004 and 2015 to develop six ML algorithm models. Models were evaluated based on accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC). The association between clinicopathological characteristics and the target variable was interpreted. Analyses were also performed using traditional logistic regression (LR).
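For readers unfamiliar with the evaluation metrics named above, the following is a minimal sketch of how they are conventionally defined for a binary outcome (DM = 1, no DM = 0). It is not the authors' code; the function names `binary_metrics` and `roc_auc` and the toy labels in the usage note are illustrative assumptions.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1-score for binary labels (1 = DM, 0 = no DM)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case receives a higher
    score than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

As a usage example on made-up labels, `binary_metrics([1, 1, 0, 0, 0], [1, 0, 1, 0, 0])` gives an accuracy of 0.6 and a precision, recall, and F1-score of 0.5 each. The pairwise AUC loop is quadratic in the number of cases, which is fine for illustration but slower than rank-based implementations on large cohorts.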
In total, 2049 patients were included, of whom 138 developed DM. Multivariable LR showed that age, sex, tumor size, extrathyroidal extension, and lymph node metastasis were predictive features for DM in MTC. Among the six ML models, the random forest (RF) performed best in assessing the risk of DM in MTC, with accuracy, precision, recall, F1-score, and AUC all higher than those of the traditional binary LR model.
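The cohort counts reported here imply a strongly imbalanced outcome, which helps explain why precision, recall, F1-score, and AUC are reported alongside accuracy. The short calculation below is illustrative arithmetic on the two reported counts only. The "always predict no DM" baseline is a hypothetical comparator, not a model from the study.

```python
n_total = 2049  # patients included (reported)
n_dm = 138      # patients who developed distant metastasis (reported)

prevalence = n_dm / n_total
# A trivial classifier that labels every patient "no DM" is correct
# for all non-DM patients and wrong for all DM patients.
baseline_accuracy = (n_total - n_dm) / n_total
baseline_recall = 0 / n_dm  # it never identifies a DM case

print(f"DM prevalence: {prevalence:.1%}")                    # DM prevalence: 6.7%
print(f"Always-negative accuracy: {baseline_accuracy:.1%}")  # Always-negative accuracy: 93.3%
print(f"Always-negative recall: {baseline_recall:.1%}")      # Always-negative recall: 0.0%
```

Because such a baseline already reaches about 93% accuracy while missing every DM case, accuracy alone cannot distinguish a useful model from a trivial one in this setting.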
RF was superior to traditional LR in predicting the risk of DM in MTC and can provide a valuable reference for clinicians in decision-making.