Prediction of Breast Cancer using Machine Learning Approaches.

Author information

Rabiei Reza, Ayyoubzadeh Seyed Mohammad, Sohrabei Solmaz, Esmaeili Marzieh, Atashi Alireza

Affiliations

PhD, Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran.

PhD, Department of Health Information Technology and Management, School of Allied Medical Sciences, Tehran University of Medical Science, Tehran, Iran.

Publication information

J Biomed Phys Eng. 2022 Jun 1;12(3):297-308. doi: 10.31661/jbpe.v0i0.2109-1403. eCollection 2022 Jun.

DOI:10.31661/jbpe.v0i0.2109-1403

PMID:35698545

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9175124/

Abstract

BACKGROUND

Breast cancer, one of the most common cancers in women, is caused by various clinical, lifestyle, social, and economic factors. Machine learning has the potential to predict breast cancer based on features hidden in data.

OBJECTIVE

This study aimed to predict breast cancer using different machine-learning approaches applying demographic, laboratory, and mammographic data.

MATERIAL AND METHODS

In this analytical study, a database of 5,178 independent records, 25% of which belonged to breast cancer patients, with 24 attributes per record, was obtained from the Motamed Cancer Institute (ACECR), Tehran, Iran. Random forest (RF), neural network (MLP), gradient boosting trees (GBT), and genetic algorithms (GA) were used in this study. Models were initially trained with demographic and laboratory features (20 features). The models were then trained with all demographic, laboratory, and mammographic features (24 features) to measure the contribution of mammographic features to breast cancer prediction.
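The two-stage design described above (train on 20 features, then on all 24, and compare) can be sketched as follows. This is an illustrative sketch only, not the authors' pipeline: it assumes scikit-learn is available, uses synthetic data from `make_classification` in place of the study database, arbitrarily treats the last 4 columns as the mammographic features, and omits the genetic algorithm. The function name `compare_feature_sets` and all hyperparameters are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the study data (assumption): 24 columns and
# roughly 25% positive records. The last 4 columns are arbitrarily
# treated as the "mammographic" features.
X, y = make_classification(
    n_samples=1000, n_features=24, n_informative=8,
    weights=[0.75, 0.25], random_state=0,
)


def compare_feature_sets(X, y, n_base=20, seed=0):
    """Fit each model on the first n_base columns and on all columns.

    Returns a dict mapping (model name, feature set) to the test-set AUC.
    """
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    # Factories so that every fit starts from a fresh, untrained model.
    factories = {
        "RF": lambda: RandomForestClassifier(n_estimators=100, random_state=seed),
        "GBT": lambda: GradientBoostingClassifier(random_state=seed),
        "MLP": lambda: make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=seed),
        ),
    }
    feature_sets = {"base": slice(0, n_base), "all": slice(None)}
    results = {}
    for name, make_model in factories.items():
        for label, cols in feature_sets.items():
            model = make_model()
            model.fit(X_tr[:, cols], y_tr)
            # Probability of the positive class, used to rank test records.
            scores = model.predict_proba(X_te[:, cols])[:, 1]
            results[(name, label)] = roc_auc_score(y_te, scores)
    return results
```

Calling `compare_feature_sets(X, y)` returns six AUC values, one per model and feature set; the difference between the "all" and "base" entries for a model gives a rough indication of what the extra four columns add on this synthetic data.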

RESULTS

RF presented higher performance compared to other techniques (accuracy 80%, sensitivity 95%, specificity 80%, and the area under the curve (AUC) 0.56). Gradient boosting (AUC=0.59) showed a stronger performance compared to the neural network.
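For readers less familiar with the reported metrics, the sketch below shows how accuracy, sensitivity, specificity, and AUC are conventionally computed for a binary classifier. It uses only the Python standard library; the function names and the toy labels in the usage note are hypothetical and unrelated to the study data.

```python
def summary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels (1 = cancer)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of actual positives detected
        "specificity": tn / (tn + fp),  # share of actual negatives cleared
    }


def auc_from_scores(y_true, scores):
    """AUC as the probability that a random positive record outscores
    a random negative record, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

With the toy labels `y_true = [1, 1, 1, 0, 0, 0, 0, 0]` and predictions `y_pred = [1, 1, 0, 1, 0, 0, 0, 0]`, `summary_metrics` gives an accuracy of 0.75, a sensitivity of 2/3, and a specificity of 0.8. Sensitivity and specificity depend on the chosen decision threshold, whereas AUC is computed from the continuous scores and summarizes ranking quality across all thresholds.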

CONCLUSION

Combining multiple risk factors in breast cancer prediction models could support early diagnosis of the disease and the planning of necessary care. Collecting, storing, and managing diverse data, together with intelligent systems that use multiple factors to predict breast cancer, can contribute to effective disease management.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e6b/9175124/ac850168638f/JBPE-12-297-g001.jpg
