SGA-Driven feature selection and random forest classification for enhanced breast cancer diagnosis: A comparative study.

Authors

Yaqoob Abrar, Verma Navneet Kumar, Mir Mushtaq Ahmad, Tejani Ghanshyam G, Eisa Nashwa Hassan Babiker, Mamoun Hussien Osman Hind, Shah Mohd Asif

Affiliations

School of Advanced Science and Language, VIT Bhopal University, Kothrikalan, Sehore, Bhopal, 466114, India.

Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, King Khalid University, Abha, 61421, Saudi Arabia.

Publication information

Sci Rep. 2025 Mar 30;15(1):10944. doi: 10.1038/s41598-025-95786-1.

Abstract

In this study, we propose a novel approach to breast cancer classification that integrates the Seagull Optimization Algorithm (SGA) for feature selection with the Random Forest (RF) classifier. The novelty of the approach lies in applying SGA to gene selection in breast cancer diagnosis for the first time: SGA systematically explores the feature space to identify the most informative gene subsets, which improves classification accuracy and reduces computational complexity. The selected features are then classified with RF, which is known for its robustness and high accuracy on complex datasets. To evaluate the proposed method, we compared it with other classifiers, including Linear Regression (LR), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN). The SGA-RF combination achieved a best mean accuracy of 99.01% with 22 genes, outperforming the other methods and performing consistently across varying feature subsets. The mean accuracies ranged from 85.35% to 94.33%, highlighting a balance between feature reduction and classification accuracy. Future work will explore integrating other nature-inspired algorithms and deep learning models to further enhance performance and clinical applicability.
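The pipeline the abstract describes (an optimizer proposes gene subsets, and a random forest scores each subset) can be illustrated with a minimal sketch. This is not the authors' implementation. The update rule is a simplified, binarized variant loosely modeled on the published seagull optimization scheme, and the synthetic dataset, the 3-fold cross-validation fitness, the per-feature spiral angle, and every name and parameter value (`fitness`, `to_mask`, `seagull_select`, `n_agents`, `n_iter`, `fc`) are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def fitness(mask, X, y, seed=0):
    """Mean 3-fold cross-validated RF accuracy on the selected columns."""
    if not mask.any():  # an empty feature subset cannot be scored
        return 0.0
    clf = RandomForestClassifier(n_estimators=20, random_state=seed)
    return float(cross_val_score(clf, X[:, mask], y, cv=3).mean())


def to_mask(position):
    """Sigmoid transfer: keep a feature when sigmoid(position) > 0.5."""
    return 1.0 / (1.0 + np.exp(-position)) > 0.5


def seagull_select(X, y, n_agents=6, n_iter=5, fc=2.0, seed=0):
    """Simplified, binarized seagull-style search over feature subsets."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    pos = rng.uniform(-1.0, 1.0, size=(n_agents, n_features))

    scores = [fitness(to_mask(p), X, y, seed) for p in pos]
    best = int(np.argmax(scores))
    best_pos, best_score = pos[best].copy(), scores[best]

    for t in range(n_iter):
        A = fc - t * (fc / n_iter)  # decays linearly: exploration -> exploitation
        for i in range(n_agents):
            # Migration: avoid collisions (A * pos) and move toward the best agent.
            B = 2.0 * A**2 * rng.random()
            D = np.abs(A * pos[i] + B * (best_pos - pos[i]))
            # Attack: spiral step around the best agent. One angle per feature
            # is a simplification made here so features can flip independently.
            k = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
            r = np.exp(0.1 * k)
            spiral = (r * np.cos(k)) * (r * np.sin(k)) * (r * k)
            pos[i] = np.clip(D * spiral + best_pos, -6.0, 6.0)

            score = fitness(to_mask(pos[i]), X, y, seed)
            if score > best_score:
                best_pos, best_score = pos[i].copy(), score

    return to_mask(best_pos), best_score


# Synthetic stand-in for a gene-expression matrix (samples x genes).
X, y = make_classification(n_samples=120, n_features=30, n_informative=5,
                           random_state=0)
mask, score = seagull_select(X, y)
print(f"selected {int(mask.sum())} of {X.shape[1]} features, CV accuracy {score:.3f}")
```

Because the fitness is evaluated with the classifier itself, this is a wrapper-style selection: each candidate subset costs a full round of cross-validation, so the population size and iteration count are kept small here. A real gene-expression study would also need a held-out test set so that the reported accuracy is not measured on data used to choose the subset.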

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/772c/11955515/87ed95dc23dc/41598_2025_95786_Fig1_HTML.jpg
