Development and validation of prediction models for papillary thyroid cancer structural recurrence using machine learning approaches.

Affiliations

Department of Nuclear Medicine, West China Hospital, Sichuan University, No 37. Guoxue Alley, 610041, Chengdu, China.

West China Biomedical Big Data Center, West China Hospital, Sichuan University, 610041, Chengdu, China.

Publication information

BMC Cancer. 2024 Apr 8;24(1):427. doi: 10.1186/s12885-024-12146-4.

Abstract

BACKGROUND

Although papillary thyroid cancer (PTC) patients are known to have an excellent prognosis, up to 30% of patients experience disease recurrence after initial treatment. Accurately predicting disease prognosis remains a challenge given that the predictive value of several predictors remains controversial. Thus, we investigated whether machine learning (ML) approaches based on comprehensive predictors can predict the risk of structural recurrence for PTC patients.

METHODS

A total of 2244 patients treated with thyroid surgery and radioiodine were included. Twenty-nine perioperative variables spanning four dimensions (demographic characteristics and comorbidities, tumor-related variables, lymph node (LN)-related variables, and metabolic and inflammatory markers) were analyzed. We developed the models with five ML algorithms: logistic regression (LR), support vector machine (SVM), extreme gradient boosting (XGBoost), random forest (RF), and neural network (NN). The area under the receiver operating characteristic curve (AUC-ROC), calibration curves, and variable importance were used to evaluate the models' performance.
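As an illustration of the discrimination metric named above, the AUC-ROC can be read as the probability that a randomly chosen patient with recurrence receives a higher predicted risk than a randomly chosen patient without recurrence, with ties counted as one half. The sketch below is not the authors' code: the function name `auc_roc` and the toy labels and scores are hypothetical and serve only to show the calculation.

```python
def auc_roc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = structural recurrence.
toy_labels = [0, 0, 1, 0, 1, 0, 1, 0]
toy_scores = [0.10, 0.35, 0.40, 0.20, 0.80, 0.55, 0.30, 0.05]

toy_auc = auc_roc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

The pairwise loop is quadratic in sample size, which is fine for a demonstration; a production analysis would more likely rely on an established statistics library.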

RESULTS

During a median follow-up of 45.5 months, 179 patients (8.0%) experienced structural recurrence. Nine variables were used to develop the models: non-stimulated thyroglobulin (Tg), LN dissection, number of LNs dissected, lymph node metastasis ratio (LNR), N stage, comorbid hypertension, comorbid diabetes, body mass index, and low-density lipoprotein. All models showed a greater AUC (AUC = 0.738 to 0.767) than did the American Thyroid Association (ATA) risk stratification (AUC = 0.620, DeLong test: P < 0.01). The SVM, XGBoost, and RF models (values listed in that order) showed greater sensitivity (0.568, 0.595, 0.676), specificity (0.903, 0.857, 0.784), accuracy (0.875, 0.835, 0.775), positive predictive value (PPV) (0.344, 0.272, 0.219), negative predictive value (NPV) (0.959, 0.959, 0.964), and F1 score (0.429, 0.373, 0.331) than did the ATA risk stratification (sensitivity = 0.432, specificity = 0.770, accuracy = 0.742, PPV = 0.144, NPV = 0.938, F1 score = 0.216). The RF model showed generally consistent calibration relative to the other models. Tg and the LNR were the two most important variables in all the models, and N stage ranked among the five most important variables in all the models.
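The F1 score is the harmonic mean of PPV (precision) and sensitivity (recall), so the reported F1 values can be checked against the reported PPV and sensitivity. The short sketch below does that with the figures quoted above; the helper `f1_from` and the dictionary layout are illustrative assumptions, and small differences are expected because the published inputs are already rounded to three decimals.

```python
def f1_from(ppv, sensitivity):
    """F1 as the harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# (PPV, sensitivity, reported F1) as quoted in the abstract.
reported = {
    "SVM": (0.344, 0.568, 0.429),
    "XGBoost": (0.272, 0.595, 0.373),
    "RF": (0.219, 0.676, 0.331),
    "ATA": (0.144, 0.432, 0.216),
}

# Absolute gap between the recomputed and the reported F1 for each model.
gaps = {
    name: abs(f1_from(ppv, sens) - f1)
    for name, (ppv, sens, f1) in reported.items()
}

for name, gap in gaps.items():
    print(f"{name}: |recomputed - reported| = {gap:.4f}")
```

Under these inputs every recomputed value should fall within about 0.001 of the reported one, which is consistent with rounding alone.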

CONCLUSIONS

The RF model achieved the expected prediction performance with generally good discrimination, calibration and interpretability in this study. This study sheds light on the potential of ML approaches for improving the accuracy of risk stratification for PTC patients.

TRIAL REGISTRATION

Retrospectively registered at www.chictr.org.cn (trial registration number: ChiCTR2300075574, date of registration: 2023-09-08).
