Optimizing cancer classification: a hybrid RDO-XGBoost approach for feature selection and predictive insights.

Affiliations

School of Advanced Science and Language, VIT Bhopal University, Kothrikalan, Sehore, Bhopal, 466114, India.

Planning Department, State Planning Institute (New Division), Lucknow, Uttar Pradesh, 226001, India.

Publication information

Cancer Immunol Immunother. 2024 Oct 9;73(12):261. doi: 10.1007/s00262-024-03843-x.

Abstract

The identification of relevant biomarkers from high-dimensional cancer data remains a significant challenge due to the complexity and heterogeneity inherent in various cancer types. Conventional feature selection methods often struggle to effectively navigate the vast solution space while maintaining high predictive accuracy. In response to these challenges, we introduce a novel feature selection approach that integrates Random Drift Optimization (RDO) with XGBoost, specifically designed to enhance the performance of cancer classification tasks. Our proposed framework not only improves classification accuracy but also offers valuable insights into the underlying biological mechanisms driving cancer progression. Through comprehensive experiments conducted on real-world cancer datasets, including Central Nervous System (CNS), Leukemia, Breast, and Ovarian cancers, we demonstrate the efficacy of our method in identifying a smaller subset of unique and relevant genes. This selection results in significantly improved classification efficiency and accuracy. When compared with popular classifiers such as Support Vector Machine, K-Nearest Neighbor, and Naive Bayes, our approach consistently outperforms these models in terms of both accuracy and F-measure metrics. For instance, our framework achieved an accuracy of 97.24% in the CNS dataset, 99.14% in Leukemia, 95.21% in Ovarian, and 87.62% in Breast cancer, showcasing its robustness and effectiveness across different types of cancer data. These results underline the potential of our RDO-XGBoost framework as a promising solution for feature selection in cancer data analysis, offering enhanced predictive performance and valuable biological insights.
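The abstract compares classifiers by accuracy and F-measure but does not say how the F-measure was computed. As a minimal sketch, assuming the common binary F1 definition (the harmonic mean of precision and recall) and using made-up labels rather than any data from the paper, the two metrics can be calculated as follows. The function name `accuracy_and_f_measure` and the `positive` parameter are illustrative, not taken from the authors' code.

```python
def accuracy_and_f_measure(y_true, y_pred, positive=1):
    """Return (accuracy, F-measure) for binary labels.

    Assumption: F-measure here means binary F1 for the `positive` class.
    The paper may use a different variant (e.g. macro-averaged).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all.
    denom = 2 * tp + fp + fn
    f_measure = 2 * tp / denom if denom else 0.0
    return accuracy, f_measure


# Toy labels for illustration only (not from the CNS, Leukemia, Breast, or Ovarian datasets).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
acc, f1 = accuracy_and_f_measure(y_true, y_pred)
print(f"accuracy={acc:.2f}, F-measure={f1:.2f}")
```

Reporting both numbers matters on gene-expression datasets with few samples and uneven class sizes, where accuracy alone can look high even when the minority class is poorly predicted.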

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5720/11464649/1dbbf407d4a4/262_2024_3843_Fig1_HTML.jpg
