Prediction of Overall and Relapse-Free Survival in Triple-Negative Breast Cancer Patients Through Machine Learning-Based Clustering on Clinical Data.

Authors

Alzate-Granados Juan Pablo, Niño Luis Fernando

Affiliations

Universidad Nacional de Colombia - Sede Bogotá, Facultad de Medicina - Depto. de Patología. Doctorado en Oncología, Bogotá, Colombia.

Universidad Nacional de Colombia - Sede Bogotá, Facultad de Ingeniería - Depto. de Ingeniería de Sistemas e Industrial. Grupo de Investigación LISI, Bogotá, Colombia.

Publication information

Clin Breast Cancer. 2025 Oct;25(7):714-719. doi: 10.1016/j.clbc.2025.07.027. Epub 2025 Jul 29.

DOI: 10.1016/j.clbc.2025.07.027
PMID: 40849239
Abstract

INTRODUCTION

Triple-negative breast cancer (TNBC) accounts for 15% to 20% of breast cancer cases and is characterized by its aggressiveness and high relapse rate. Due to the absence of hormonal receptors and HER2, standard treatment relies on chemotherapy, yielding limited outcomes in overall survival (OS) and relapse-free survival (RFS). The molecular heterogeneity of TNBC complicates risk stratification and personalized treatment approaches. In this context, unsupervised machine learning could improve the identification of clinically homogeneous subgroups and facilitate prognostic predictions.

OBJECTIVE

To develop predictive models for OS and RFS in TNBC patients using machine learning algorithms, specifically k-prototypes for subgroup identification and random forest for outcome prediction.

METHODS

A retrospective cohort study was conducted on 4808 TNBC patients diagnosed between 2012 and 2024. Clinical, demographic, and biomolecular variables were analyzed from anonymized clinical records. The k-prototypes algorithm was applied to cluster patients into groups based on shared characteristics. Subsequently, predictive models using random forest were trained and evaluated through stratified cross-validation and metrics such as AUC, sensitivity, and specificity. Cox regression was used to identify risk factors associated with mortality and relapse.
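
The abstract does not describe how the clustering was implemented. As an illustration only, the sketch below shows the general k-prototypes idea for mixed data: the distance to a prototype is the squared Euclidean distance on numeric features plus a weight `gamma` times the number of categorical mismatches, and each prototype is updated to the mean (numeric) and mode (categorical) of its members. The feature values, category labels, starting prototypes, and `gamma=1.0` are hypothetical and are not taken from the study.

```python
from collections import Counter


def mixed_distance(a_num, a_cat, b_num, b_cat, gamma=1.0):
    """Squared Euclidean distance on numeric parts plus gamma * categorical mismatches."""
    numeric = sum((x - y) ** 2 for x, y in zip(a_num, b_num))
    mismatches = sum(1 for x, y in zip(a_cat, b_cat) if x != y)
    return numeric + gamma * mismatches


def k_prototypes(num_rows, cat_rows, init_idx, gamma=1.0, max_iter=20):
    """Cluster mixed-type rows; init_idx lists the rows used as starting prototypes."""
    protos = [(list(num_rows[i]), list(cat_rows[i])) for i in init_idx]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each row goes to its nearest prototype.
        new_labels = [
            min(range(len(protos)),
                key=lambda k: mixed_distance(n, c, protos[k][0], protos[k][1], gamma))
            for n, c in zip(num_rows, cat_rows)
        ]
        if new_labels == labels:
            break  # assignments are stable, so stop iterating
        labels = new_labels
        # Update step: mean for numeric features, mode for categorical features.
        for k in range(len(protos)):
            members = [i for i, lab in enumerate(labels) if lab == k]
            if not members:
                continue  # keep the previous prototype for an empty cluster
            mean = [sum(num_rows[i][d] for i in members) / len(members)
                    for d in range(len(num_rows[0]))]
            mode = [Counter(cat_rows[i][d] for i in members).most_common(1)[0][0]
                    for d in range(len(cat_rows[0]))]
            protos[k] = (mean, mode)
    return labels, protos


# Hypothetical toy records: two scaled numeric features and two categorical features.
num = [[0.1, 0.0], [0.2, 0.1], [0.9, 1.0], [1.0, 0.9]]
cat = [["ECOG0-1", "BRCA-wt"], ["ECOG0-1", "BRCA-wt"],
       ["ECOG3+", "BRCA-mut"], ["ECOG3+", "BRCA-mut"]]
labels, protos = k_prototypes(num, cat, init_idx=[0, 2])
print(labels)
```

In practice the numeric features would need to be scaled before clustering, and the choice of `gamma` and of the number of clusters would need to be justified; neither is reported in the abstract.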

RESULTS

Four clusters with distinct risk profiles were identified. Overall mortality was 28.8%, and relapse occurred in 40.9%, with a median follow-up time of 8.46 years. The highest-risk group exhibited a mortality rate of 42.3% and a relapse rate of 54.2%, associated with poorer functional status (ECOG ≥3) and a high prevalence of BRCA1/2 mutations (71%). The random forest model achieved 80% accuracy in mortality prediction (AUC = 0.78) and 75% accuracy in relapse prediction (AUC = 0.76). Factors such as the Charlson Comorbidity Index, ECOG, BRCA1/2 status, and PD-L1 expression were key determinants in outcome prediction.
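
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, sensitivity, specificity, and AUC can be computed from binary outcomes and predicted risk scores. It is not the authors' evaluation code. The labels, scores, and the 0.5 decision threshold are hypothetical, and the functions assume both outcome classes are present.

```python
def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true-positive rate), and specificity (true-negative rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = event, 0 = no event) and model risk scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = confusion_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, round(auc, 3))
```

Accuracy depends on the chosen threshold and on how common the event is, whereas AUC summarizes ranking quality across all thresholds, which is why the two are reported side by side.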

DISCUSSION

The findings confirm the relevance of machine learning in TNBC stratification. A clinically meaningful classification was achieved, outperforming traditional models based solely on clinical or genomic variables. Comorbid burden and tumor biomarkers played crucial roles in outcome prediction. Despite its strengths, the study has limitations, including its retrospective nature and the absence of transcriptomic data. Prospective validation of these models could enhance their applicability in clinical practice.
