SGA-Driven feature selection and random forest classification for enhanced breast cancer diagnosis: A comparative study.

Authors

Yaqoob Abrar, Verma Navneet Kumar, Mir Mushtaq Ahmad, Tejani Ghanshyam G, Eisa Nashwa Hassan Babiker, Mamoun Hussien Osman Hind, Shah Mohd Asif

Affiliations

School of Advanced Science and Language, VIT Bhopal University, Kothrikalan, Sehore, Bhopal, 466114, India.

Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, King Khalid University, Abha, 61421, Saudi Arabia.

Publication information

Sci Rep. 2025 Mar 30;15(1):10944. doi: 10.1038/s41598-025-95786-1.

DOI:10.1038/s41598-025-95786-1
PMID:40159513
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11955515/
Abstract

In this study, we propose a novel approach for breast cancer classification that integrates the Seagull Optimization Algorithm (SGA) for feature selection with the Random Forest (RF) classifier for effective data classification. The novelty of our approach lies in the first-time application of SGA for gene selection in breast cancer diagnosis, where SGA systematically explores the feature space to identify the most informative gene subsets, thereby improving classification accuracy and reducing computational complexity. The selected features are subsequently classified using RF, known for its robustness and high accuracy in handling complex datasets. To evaluate the effectiveness of the proposed method, we compared it with other classifiers, including Linear Regression (LR), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN). The proposed SGA-RF combination achieved a best mean accuracy of 99.01% with 22 genes, outperforming other methods and demonstrating consistent performance across varying feature subsets. The mean accuracies ranged from 85.35 to 94.33%, highlighting a balance between feature reduction and classification accuracy. Future work will explore the integration of other nature-inspired algorithms and deep learning models to further enhance performance and clinical applicability.
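
The wrapper-style selection loop described in the abstract (search over gene subsets, score each subset by classifier accuracy) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a simple population-based search stands in for the seagull optimizer, a nearest-centroid classifier stands in for the random forest so that the sketch needs only the Python standard library, and the toy data, the fitness weighting `alpha`, and every function name are hypothetical.

```python
import random

def make_toy_data(n_samples=60, n_genes=12, informative=(0, 1, 2), seed=0):
    """Synthetic expression matrix (assumption): only `informative` genes differ by class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in informative:
            row[g] += 3.0 * label  # shift informative genes for class 1
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X, y, mask):
    """Hold-out accuracy of a nearest-centroid classifier (stand-in for RF)."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    split = len(X) // 2
    train, test = range(split), range(split, len(X))
    centroids = {}
    for label in set(y):
        rows = [X[i] for i in train if y[i] == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for i in test:
        dist = {lab: sum((X[i][j] - c[k]) ** 2 for k, j in enumerate(cols))
                for lab, c in centroids.items()}
        correct += min(dist, key=dist.get) == y[i]
    return correct / len(test)

def fitness(X, y, mask, alpha=0.9):
    """Higher is better: reward accuracy, penalize large gene subsets (assumed form)."""
    acc = centroid_accuracy(X, y, mask)
    size_ratio = sum(mask) / len(mask)
    return alpha * acc + (1 - alpha) * (1 - size_ratio)

def select_genes(X, y, pop_size=20, iterations=30, seed=1):
    """Population search: each candidate drifts toward the best mask found so far."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=lambda m: fitness(X, y, m))
    for _ in range(iterations):
        new_pop = []
        for mask in pop:
            # Copy each bit from the current best with probability 0.5 ...
            child = [b if rng.random() < 0.5 else m for m, b in zip(mask, best)]
            # ... then flip each bit with a small mutation probability.
            child = [(not bit) if rng.random() < 0.05 else bit for bit in child]
            new_pop.append(child)
        pop = new_pop
        cand = max(pop, key=lambda m: fitness(X, y, m))
        if fitness(X, y, cand) > fitness(X, y, best):
            best = cand
    return best

X, y = make_toy_data()
best_mask = select_genes(X, y)
selected = [j for j, keep in enumerate(best_mask) if keep]
accuracy = centroid_accuracy(X, y, best_mask)
print("selected genes:", selected, "hold-out accuracy:", round(accuracy, 3))
```

The size penalty in `fitness` is what pushes the search toward small gene subsets. In a real pipeline, the stand-in classifier would be replaced by a random forest evaluated with cross-validation, and the search step by the optimizer's own update rules.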
