Risk factors of breast cancer patients developing multiple primary cancers: a retrospective study and establishing/testing of machine learning models.

Authors

Jin Yudi, Su Tong, Fan Yanjia, Zheng Yineng, Tian Cheng, Ouyang Zubin, Lv Fajin

Affiliations

Department of Radiology, The First Affiliated Hospital of Chongqing Medical University, Chongqing, 400016, China.

Department of Breast and Thyroid Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China.

Publication information

BMC Med Inform Decis Mak. 2025 Jul 25;25(1):277. doi: 10.1186/s12911-025-03086-5.

Abstract

BACKGROUND

Breast cancer is a prevalent malignancy globally, with approximately 1 in 10 breast cancer patients at risk of developing additional primary malignant tumors. This study seeks to explore the risk factors linked to the development of multiple primary cancers (MPCs) in breast cancer patients and to develop predictive models to aid in clinical decision-making.

METHODS

A cohort of patients from the Surveillance, Epidemiology, and End Results (SEER) database was analyzed to identify key factors contributing to the occurrence of MPCs. Machine learning models, including logistic regression and random forest, were established and tested to predict the risk of developing multiple primary cancers.

RESULTS

A total of 120,434 breast cancer patients were included in the study. After random undersampling of the majority class and random selection of a quarter of the population, there were 3432 patients in each of the one primary breast cancer (OPBC) group and the MPCs group. A logistic regression model and a random forest model were constructed based on age, marital status, laterality, histological type, tumor grade, American Joint Committee on Cancer (AJCC) stage, T and N stage, molecular subtype, surgery, chemotherapy, and radiotherapy. The logistic regression model achieved an area under the curve (AUC) of 0.902, a specificity of 0.905, and a sensitivity of 0.767 in the training set, and an AUC of 0.886, a specificity of 0.882, and a sensitivity of 0.782 in the testing set. The random forest model achieved an AUC of 0.955, a specificity of 0.916, and a sensitivity of 0.859 in the training set, and an AUC of 0.874, a specificity of 0.858, and a sensitivity of 0.769 in the testing set. A nomogram was plotted based on the logistic regression model. The Kaplan-Meier (K-M) curves demonstrated statistically significant differences in prognosis among the risk groups stratified by the nomogram.
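
For readers unfamiliar with the quantities reported above, the following is a minimal sketch of random undersampling and of how sensitivity, specificity, and AUC are computed from predicted scores. It is not the authors' code: the function names, the 0.5 threshold, and the toy data are assumptions made for illustration, and only the Python standard library is used.

```python
import random

def undersample(labels, seed=0):
    """Return indices with equal counts of class 0 and class 1,
    obtained by randomly dropping rows from the larger class."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(pos), len(neg))
    return sorted(rng.sample(pos, n) + rng.sample(neg, n))

def sensitivity_specificity(y_true, y_score, threshold=0.5):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    The threshold is an illustrative assumption."""
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, y_score):
    """AUC as the probability that a random positive case scores
    higher than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 1 = MPCs, 0 = OPBC.
toy_labels = [1, 0, 0, 0, 1, 0, 0, 0]
balanced_idx = undersample(toy_labels)

toy_true = [1, 1, 1, 0, 0, 0]
toy_score = [0.9, 0.6, 0.4, 0.7, 0.3, 0.1]
sens, spec = sensitivity_specificity(toy_true, toy_score)
toy_auc = auc(toy_true, toy_score)
```

In this sketch the balanced index list keeps every minority-class row and an equally sized random draw from the majority class, which mirrors the equal group sizes reported for the OPBC and MPCs groups.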

CONCLUSIONS

This study assessed several risk factors influencing the development of MPCs in breast cancer patients. The machine learning model could offer a practical tool for personalized risk assessment in this patient population.
