Survival prediction in second primary breast cancer patients with machine learning: An analysis of SEER database.

Affiliations

School of Public Health, Xiamen University, Xiang'an South Road, Xiang'an District, Xiamen, Fujian 361102, China; Key Laboratory of Health Technology Assessment of Fujian Province, Xiamen, Fujian, China; School of Nursing, Faculty of Health and Social Sciences, The Hong Kong Polytechnic University, Hong Kong SAR, China.

School of Public Health, Xiamen University, Xiang'an South Road, Xiang'an District, Xiamen, Fujian 361102, China; Key Laboratory of Health Technology Assessment of Fujian Province, Xiamen, Fujian, China.

Publication information

Comput Methods Programs Biomed. 2024 Sep;254:108310. doi: 10.1016/j.cmpb.2024.108310. Epub 2024 Jun 25.

DOI:10.1016/j.cmpb.2024.108310
PMID:38996803
Abstract

BACKGROUND

Studies have found that first primary cancer (FPC) survivors are at high risk of developing second primary breast cancer (SPBC). However, there is a lack of prognostic studies specifically focusing on patients with SPBC.

METHODS

This retrospective study used data from the Surveillance, Epidemiology, and End Results (SEER) Program. We selected female FPC survivors diagnosed with SPBC from 12 registries (January 1998 to December 2018) to construct prognostic models. SPBC patients from another five registries (January 2010 to December 2018) served as the validation set to test the models' generalization ability. Four machine learning models and a Cox proportional hazards regression (CoxPH) model were constructed to predict the overall survival of SPBC patients. Univariate and multivariable Cox regression analyses were used for feature selection. Model performance was assessed using the time-dependent area under the ROC curve (t-AUC) and the integrated Brier score (iBrier).
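The two evaluation metrics named above can be illustrated with a minimal sketch. It is not the authors' code: it assumes every patient's death time is observed (no censoring), whereas a real SEER analysis would need censoring-aware estimators such as inverse-probability-of-censoring weights. The function names, the toy survival times, and the time grid are all hypothetical.

```python
from typing import Sequence


def brier_at(t: float, event_times: Sequence[float],
             surv_at_t: Sequence[float]) -> float:
    """Brier score at horizon t: mean squared gap between the observed
    status 1{T_i > t} and the predicted survival probability S_i(t).
    Assumption: no censoring, so every T_i is a true death time."""
    total = sum(((1.0 if T > t else 0.0) - s) ** 2
                for T, s in zip(event_times, surv_at_t))
    return total / len(event_times)


def integrated_brier(grid: Sequence[float], event_times: Sequence[float],
                     surv_curves: Sequence[Sequence[float]]) -> float:
    """Trapezoid-rule average of brier_at over the time grid.
    surv_curves[i][k] is the predicted S_i(grid[k])."""
    scores = [brier_at(t, event_times, [curve[k] for curve in surv_curves])
              for k, t in enumerate(grid)]
    area = sum(0.5 * (scores[k] + scores[k + 1]) * (grid[k + 1] - grid[k])
               for k in range(len(grid) - 1))
    return area / (grid[-1] - grid[0])


def auc_at(t: float, event_times: Sequence[float],
           risk_scores: Sequence[float]) -> float:
    """Time-dependent AUC at horizon t: probability that a patient who died
    by t has a higher risk score than one still alive at t (ties count 0.5)."""
    cases = [r for T, r in zip(event_times, risk_scores) if T <= t]
    controls = [r for T, r in zip(event_times, risk_scores) if T > t]
    if not cases or not controls:
        raise ValueError("need at least one case and one control at time t")
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: four patients, survival times in years.
event_times = [2.0, 5.0, 8.0, 12.0]
grid = [3.0, 6.0, 9.0]
risk_scores = [0.9, 0.7, 0.4, 0.1]          # higher = worse prognosis
uninformative = [[0.5] * len(grid) for _ in event_times]

print(auc_at(6.0, event_times, risk_scores))
print(integrated_brier(grid, event_times, uninformative))
```

A higher t-AUC means better discrimination at the chosen horizon, while a lower iBrier means better-calibrated survival curves overall; a model that always predicts 0.5 scores 0.25 on the Brier scale.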

RESULTS

A total of 10,321 female FPC survivors with SPBC (mean age [SD]: 66.03 [11.17]) were included for model construction. These patients were randomly split into a training set (mean age [SD]: 65.98 [11.15]) and a test set (mean age [SD]: 66.15 [11.23]) at a ratio of 7:3. In the validation set, a total of 3,638 SPBC patients (mean age [SD]: 66.28 [10.68]) were finally enrolled. Sixteen features were selected for model construction through univariate and multivariable Cox regression analyses. Among the five models, the random survival forest model performed best, with a t-AUC of 0.805 (95% CI: 0.803-0.807) and an iBrier of 0.123 (95% CI: 0.122-0.124) on the test set, and a t-AUC of 0.803 (95% CI: 0.801-0.807) and an iBrier of 0.098 (95% CI: 0.096-0.103) on the validation set. Feature importance ranking identified the six most important predictive features of the random survival forest model, with age ranked first, followed by stage, regional nodes positive, latency, radiotherapy, and surgery.
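The 7:3 random split described above can be sketched as follows. This is an illustration only: the abstract does not report the exact set sizes, the random seed, or whether the split was stratified, so the function name `split_train_test`, the seed, and the rounding rule are assumptions.

```python
import random
from typing import List, Sequence, Tuple


def split_train_test(ids: Sequence[int], train_fraction: float = 0.7,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle patient identifiers reproducibly and cut them into
    training and test subsets (hypothetical helper, not from the paper)."""
    rng = random.Random(seed)      # fixed seed keeps the split repeatable
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# 10,321 is the cohort size reported in the abstract; the IDs are placeholders.
patient_ids = list(range(10321))
train_ids, test_ids = split_train_test(patient_ids)
print(len(train_ids), len(test_ids))
```

Splitting on patient identifiers rather than on rows keeps every record of one patient on the same side of the split, which avoids leaking information from the training set into the test set.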

CONCLUSIONS

The random survival forest model outperformed CoxPH and the other machine learning models in predicting the overall survival of patients with SPBC, which could support the monitoring of high-risk populations.

Similar articles

1
Survival prediction in second primary breast cancer patients with machine learning: An analysis of SEER database.
Comput Methods Programs Biomed. 2024 Sep;254:108310. doi: 10.1016/j.cmpb.2024.108310. Epub 2024 Jun 25.
2
Deep learning models for predicting the survival of patients with hepatocellular carcinoma based on a surveillance, epidemiology, and end results (SEER) database analysis.
Sci Rep. 2024 Jun 9;14(1):13232. doi: 10.1038/s41598-024-63531-9.
3
Explainable machine learning predicts survival of retroperitoneal liposarcoma: A study based on the SEER database and external validation in China.
Cancer Med. 2024 Jun;13(11):e7324. doi: 10.1002/cam4.7324.
4
Deep learning model for predicting the survival of patients with primary gastrointestinal lymphoma based on the SEER database and a multicentre external validation cohort.
J Cancer Res Clin Oncol. 2023 Oct;149(13):12177-12189. doi: 10.1007/s00432-023-05123-0. Epub 2023 Jul 10.
5
Machine learning-based individualized survival prediction model for prognosis in osteosarcoma: Data from the SEER database.
Medicine (Baltimore). 2024 Sep 27;103(39):e39582. doi: 10.1097/MD.0000000000039582.
6
Risk of second primary breast cancer among cancer survivors: Implications for prevention and screening practice.
PLoS One. 2020 Jun 4;15(6):e0232800. doi: 10.1371/journal.pone.0232800. eCollection 2020.
7
Can Predictive Modeling Tools Identify Patients at High Risk of Prolonged Opioid Use After ACL Reconstruction?
Clin Orthop Relat Res. 2020 Jul;478(7):0-1618. doi: 10.1097/CORR.0000000000001251.
8
The Application and Comparison of Machine Learning Models for the Prediction of Breast Cancer Prognosis: Retrospective Cohort Study.
JMIR Med Inform. 2022 Feb 18;10(2):e33440. doi: 10.2196/33440.
9
Risk factors and prognostic nomogram for patients with second primary cancers after lung cancer using classical statistics and machine learning.
Clin Exp Med. 2023 Sep;23(5):1609-1620. doi: 10.1007/s10238-022-00858-5. Epub 2022 Jul 11.
10
Risk, molecular subtype and prognosis of second primary breast cancer: an analysis based on first primary cancers.
Am J Cancer Res. 2023 Jul 15;13(7):3203-3220. eCollection 2023.

Cited by

1
Decoding the gut microbiota metabolite-matrix metalloproteinase-3 axis in breast cancer: a multi-omics and network pharmacology study.
Mol Divers. 2025 Sep 14. doi: 10.1007/s11030-025-11351-y.
2
Development and validation of nomograms for predicting survival of locally advanced rectosigmoid junction cancer patients: a SEER database analysis.
Transl Cancer Res. 2025 May 30;14(5):2808-2821. doi: 10.21037/tcr-24-1810. Epub 2025 May 27.
3
Survival prediction from imbalanced colorectal cancer dataset using hybrid sampling methods and tree-based classifiers.
Sci Rep. 2025 Apr 25;15(1):14554. doi: 10.1038/s41598-025-98703-8.
4
Comparing Random Survival Forests and Cox Regression for Nonresponders to Neoadjuvant Chemotherapy Among Patients With Breast Cancer: Multicenter Retrospective Cohort Study.
J Med Internet Res. 2025 Apr 8;27:e69864. doi: 10.2196/69864.