An enhanced and efficient approach for feature selection for chronic human disease prediction: A breast cancer study.

Authors

Khanna Munish, Singh Law Kumar, Shrivastava Kapil, Singh Rekha

Affiliations

School of Computing Science and Engineering, Galgotias University, Greater Noida, Gautam Buddh Nagar, India.

Department of Computer Engineering and Applications, GLA University, Mathura, India.

Publication information

Heliyon. 2024 Feb 28;10(5):e26799. doi: 10.1016/j.heliyon.2024.e26799. eCollection 2024 Mar 15.

DOI:10.1016/j.heliyon.2024.e26799

PMID:38463826

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10920178/

Abstract

Computer-aided diagnosis (CAD) systems play a vital role in modern research by effectively minimizing both time and costs. These systems support healthcare professionals like radiologists in their decision-making process by efficiently detecting abnormalities as well as offering accurate and dependable information. These systems heavily depend on the efficient selection of features to accurately categorize high-dimensional biological data. These features can subsequently assist in the diagnosis of related medical conditions. The task of identifying patterns in biomedical data can be quite challenging due to the presence of numerous irrelevant or redundant features. Therefore, it is crucial to propose and then utilize a feature selection (FS) process in order to eliminate these features. The primary goal of FS approaches is to improve the accuracy of classification by eliminating features that are irrelevant or less informative. The FS phase plays a critical role in attaining optimal results in machine learning (ML)-driven CAD systems. The effectiveness of ML models can be significantly enhanced by incorporating efficient features during the training phase. This empirical study presents a methodology for the classification of biomedical data using the FS technique. The proposed approach incorporates three soft computing-based optimization algorithms, namely Teaching Learning-Based Optimization (TLBO), Elephant Herding Optimization (EHO), and a proposed hybrid algorithm of these two. These algorithms were previously employed; however, their effectiveness in addressing FS issues in predicting human diseases has not been investigated. The following evaluation focuses on the categorization of benign and malignant tumours using the publicly available Wisconsin Diagnostic Breast Cancer (WDBC) benchmark dataset. The five-fold cross-validation technique is employed to mitigate the risk of over-fitting. 
The proposed approach is evaluated on several metrics: sensitivity, specificity, precision, accuracy, area under the receiver operating characteristic curve (AUC), and F1-score. The best accuracy obtained with the proposed approach is 97.96%. The proposed clinical decision support system shows strong classification performance, making it a useful second-opinion tool for medical practitioners and reducing the burden on expert clinicians.
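The wrapper-style evaluation described in the abstract can be illustrated with a minimal sketch: a candidate feature subset, encoded as a binary mask, is scored by its mean five-fold cross-validated accuracy, and the confusion-matrix metrics are computed alongside. Several things here are assumptions rather than details from the study: the function names (`binary_metrics`, `k_fold_indices`, `mask_fitness`) are hypothetical, a nearest-centroid rule stands in for whatever classifier the authors used, accuracy is only one plausible fitness choice, and the demo uses synthetic data instead of WDBC. The TLBO/EHO search that would propose the masks is not shown, and AUC is omitted because it needs continuous scores rather than hard labels.

```python
import random
from statistics import mean


def binary_metrics(y_true, y_pred, positive=1):
    """Confusion-matrix metrics, treating `positive` as the malignant class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        return num / den if den else 0.0  # guard against empty denominators

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, len(pairs)),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def k_fold_indices(n, k=5, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def nearest_centroid_predict(X_train, y_train, X_test):
    """Stand-in classifier (an assumption): assign the closest class centroid."""
    centroids = {}
    for label in set(y_train):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return [min(centroids, key=lambda c: sq_dist(x, centroids[c])) for x in X_test]


def mask_fitness(X, y, mask, k=5, seed=0):
    """Mean k-fold accuracy using only the features where mask is 1."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    Xs = [[row[j] for j in cols] for row in X]
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        y_pred = nearest_centroid_predict(
            [Xs[i] for i in train_idx],
            [y[i] for i in train_idx],
            [Xs[i] for i in test_idx],
        )
        scores.append(binary_metrics([y[i] for i in test_idx], y_pred)["accuracy"])
    return mean(scores)


if __name__ == "__main__":
    # Synthetic demo: feature 0 separates the classes, feature 1 is pure noise.
    rng = random.Random(42)
    y = [i % 2 for i in range(100)]
    X = [[4.0 * label + rng.gauss(0, 1), rng.gauss(0, 1)] for label in y]
    for mask in ([1, 0], [0, 1], [1, 1]):
        print(mask, round(mask_fitness(X, y, mask), 3))
```

In a metaheuristic feature selector, a function like `mask_fitness` would be called once per candidate solution, so the classifier inside it is usually kept cheap. Holding the fold assignment fixed through `seed` keeps scores comparable across candidate masks.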

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a17/10920178/f397b3a3fe69/gr1.jpg
