
Development and validation of machine learning models for predicting lung metastasis risk in differentiated thyroid cancer based on two databases.

Author information

Shen Haolin, Yang Caiyun, Wang Yuegui, Liao Jianmei, Zuo Xianbo, Zhang Bo, Yang Xiao

Affiliations

Department of Ultrasound, Zhangzhou Municipal Hospital Affiliated to Fujian Medical University, Zhangzhou, China.

Department of Dermatology, China-Japan Friendship Hospital, Beijing, China.

Publication information

Gland Surg. 2024 Nov 30;13(11):2174-2188. doi: 10.21037/gs-24-481. Epub 2024 Nov 26.

DOI: 10.21037/gs-24-481
PMID: 39678420
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11635582/
Abstract

BACKGROUND

Differentiated thyroid cancer (DTC) progresses slowly, but patients with lung metastasis (LM) have a poor prognosis. The aim of this study was to develop and evaluate the predictive ability of machine learning (ML) models in estimating the risk of LM in patients with DTC and to identify the independent risk factors specific to different age and gender subgroups.

METHODS

The demographic and clinicopathological data of patients with DTC were obtained from two databases: first, the National Institutes of Health Surveillance, Epidemiology, and End Results (SEER) database [2010-2015], which provides extensive epidemiological and clinical information on cancer patients; second, records from Zhangzhou Municipal Hospital Affiliated to Fujian Medical University [2014-2017], which focus more on patients' specific clinicopathological characteristics and treatment outcomes. Variables common to both databases were extracted. The data were then split into training, testing, and validation sets. The training set was used to build and train the ML models, while the testing and validation sets were used to assess their performance. Five ML models were developed: logistic regression (LR), random forest (RF), decision tree (DT), extreme gradient boosting (XGBoost), and gradient boosting machine (GBM). The models were evaluated with accuracy, precision, recall, F1 score, Brier score, area under the receiver operating characteristic (ROC) curve (AUROC), area under the precision-recall (PR) curve (PR-AUC), calibration curves, and decision curve analysis (DCA). Feature importance was ranked and visualized for the top-performing models.
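As an illustration of several of the evaluation metrics named above, the following minimal Python sketch computes accuracy, precision, recall, F1 score, Brier score, and AUROC from binary labels and predicted probabilities. It is not the authors' code: the function names, the 0.5 classification threshold, and the toy data are assumptions made purely for illustration.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 at a fixed probability threshold.

    The 0.5 default is an assumption; the abstract does not state a cutoff.
    """
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUROC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney formulation).
    """
    positives = [p for y, p in zip(y_true, y_prob) if y == 1]
    negatives = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    correct = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in positives
        for pn in negatives
    )
    return correct / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy data, invented for illustration only (1 = lung metastasis).
    labels = [1, 1, 0, 0, 1, 0]
    probabilities = [0.9, 0.6, 0.4, 0.2, 0.3, 0.7]
    print(classification_metrics(labels, probabilities))
    print("Brier:", brier_score(labels, probabilities))
    print("AUROC:", auroc(labels, probabilities))
```

A lower Brier score indicates better-calibrated probabilities, while AUROC measures ranking ability only, so the two can disagree; reporting both, as the study does, covers discrimination and calibration separately.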

RESULTS

The analysis identified age, gender, tumor size, T stage, N stage, and histologic type as significant independent risk factors for LM. The effects of gender, T stage, and histological type on the risk of LM varied across the different age subgroups. In the female population, tumor size was an independent risk factor for LM, while it was not in the male population. GBM achieved an AUROC of 0.982, a Brier score of 0.047, an accuracy of 0.818, and an F1 score of 0.818 in the validation set, outperforming the other models.

CONCLUSIONS

The GBM model emerged as an effective tool for identifying high-risk LM populations in DTC, with the potential to guide clinical practice and facilitate the development of individualized treatment plans. Further research to validate these findings across more diverse patient populations and clinical settings is recommended.
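The clinical usefulness of a risk model is commonly summarized by the net benefit used in decision curve analysis (DCA), one of the evaluation methods listed in the abstract. The sketch below shows the standard net-benefit formula, TP/n - FP/n × pt/(1 - pt), at a chosen threshold probability pt, alongside the "treat all" reference strategy. The function names and toy data are hypothetical, and nothing here reproduces the study's actual analysis.

```python
from typing import Sequence


def net_benefit(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float
) -> float:
    """Net benefit of acting on patients whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    true_pos = sum(
        1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1
    )
    false_pos = sum(
        1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0
    )
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Net benefit of the reference strategy that treats every patient."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Toy data, invented for illustration only (1 = lung metastasis).
    labels = [1, 1, 0, 0, 1, 0]
    probabilities = [0.9, 0.6, 0.4, 0.2, 0.3, 0.7]
    for pt in (0.25, 0.5):
        model_nb = net_benefit(labels, probabilities, pt)
        all_nb = net_benefit_treat_all(labels, pt)
        print(f"pt={pt}: model={model_nb:.4f}, treat-all={all_nb:.4f}")
```

A decision curve plots these values across a range of thresholds; a model is considered useful at a given threshold when its net benefit exceeds both the "treat all" value and zero (the "treat none" strategy).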


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5fb1/11635582/33327bfcb95c/gs-13-11-2174-f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5fb1/11635582/ac8eb1445351/gs-13-11-2174-f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5fb1/11635582/72d85c814c80/gs-13-11-2174-f2.jpg
