He Chongwu, Yu Tenghua, Yang Liu, He Longbo, Zhu Jin, Chen Jing
Department of Breast Surgery, The Second Affiliated Hospital of Nanchang Medical College, Jiangxi Cancer Hospital, Nanchang, Jiangxi Province, China.
Department of Pathology, Nanchang People's Hospital, Nanchang, Jiangxi Province, China.
BMC Cancer. 2025 May 23;25(1):933. doi: 10.1186/s12885-025-14335-1.
This study aimed to develop and validate machine learning models to predict pathological complete response (pCR) after neoadjuvant therapy in patients with breast cancer.
Clinical and pathological data from 1143 patients were analyzed, encompassing variables such as age, gender, marital status, histologic grade, T stage, N stage, months from diagnosis to treatment, molecular subtype, and response to neoadjuvant therapy. Seven machine learning models were trained and validated using both internal and external datasets. Model performance was evaluated using multiple metrics, and interpretability analysis was conducted to assess feature importance.
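The evaluation metrics named in this study (accuracy, sensitivity, specificity, and F1 score) can all be derived from a binary confusion matrix. The following is a minimal sketch in Python, assuming pCR is the positive class; the function name `binary_metrics` and the counts passed to it are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp: predicted pCR, actually pCR        fp: predicted pCR, actually non-pCR
    tn: predicted non-pCR, actually non-pCR fn: predicted non-pCR, actually pCR
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # recall for the pCR class
    specificity = tn / (tn + fp)   # recall for the non-pCR class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not the study's data).
metrics = binary_metrics(tp=70, fp=20, tn=80, fn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are reported separately because they can diverge: a model may identify most non-responders while still missing a substantial share of patients who achieve pCR.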
Key variables influencing pCR included grade, N stage, months from diagnosis to treatment, and molecular subtype. The Naive Bayes model emerged as the most effective, with accuracy (0.746), sensitivity (0.699), specificity (0.808), and F1 score (0.759) surpassing other models. Both internal and external validation confirmed the model's robust predictive power. A web tool was developed for clinical use, aiding in personalized treatment planning. Interpretability analysis further elucidated the contribution of features to pCR prediction, enhancing clinical applicability.
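To illustrate how a Naive Bayes classifier combines categorical predictors such as grade, N stage, and molecular subtype, the sketch below implements a small categorical Naive Bayes model with Laplace smoothing in plain Python. It is not the authors' implementation: the class name `CategoricalNB`, the smoothing parameter `alpha`, and the eight training rows are all hypothetical, and the rows are synthetic data invented for illustration. The continuous variable (months from diagnosis to treatment) is omitted for simplicity.

```python
import math
from collections import Counter


class CategoricalNB:
    """Toy categorical Naive Bayes with Laplace (additive) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        n_features = len(rows[0])
        # counts[c][j][v]: number of class-c rows whose feature j equals v
        self.counts = {c: [Counter() for _ in range(n_features)]
                       for c in self.classes}
        # values[j]: distinct values observed for feature j
        self.values = [set() for _ in range(n_features)]
        for row, y in zip(rows, labels):
            for j, v in enumerate(row):
                self.counts[y][j][v] += 1
                self.values[j].add(v)
        return self

    def predict_proba(self, row):
        n = sum(self.class_counts.values())
        log_post = {}
        for c in self.classes:
            # log prior plus the sum of smoothed log likelihoods,
            # treating features as conditionally independent given the class
            lp = math.log(self.class_counts[c] / n)
            for j, v in enumerate(row):
                num = self.counts[c][j][v] + self.alpha
                den = self.class_counts[c] + self.alpha * len(self.values[j])
                lp += math.log(num / den)
            log_post[c] = lp
        # normalise in log space for numerical stability
        peak = max(log_post.values())
        unnorm = {c: math.exp(lp - peak) for c, lp in log_post.items()}
        z = sum(unnorm.values())
        return {c: u / z for c, u in unnorm.items()}


# Synthetic rows: (histologic grade, N stage, molecular subtype).
rows = [
    ("G3", "N0", "HER2+"), ("G3", "N0", "TNBC"),
    ("G3", "N1", "HER2+"), ("G2", "N0", "HER2+"),
    ("G2", "N1", "HR+/HER2-"), ("G2", "N2", "HR+/HER2-"),
    ("G1", "N1", "HR+/HER2-"), ("G3", "N2", "TNBC"),
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = pCR, 0 = non-pCR (synthetic)

model = CategoricalNB(alpha=1.0).fit(rows, labels)
probs = model.predict_proba(("G3", "N0", "HER2+"))
print(f"Estimated P(pCR) on toy data: {probs[1]:.3f}")
```

Because the model multiplies per-feature likelihoods, each predictor's contribution to the prediction can be inspected on its own, which is one reason Naive Bayes lends itself to the kind of interpretability analysis described above.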
The Naive Bayes model provides a robust tool for personalized treatment decisions in patients with breast cancer undergoing neoadjuvant therapy. By accurately predicting pCR rates, it enables clinicians to tailor treatment strategies, potentially improving outcomes.