An Ensemble Feature Selection Method for Biomarker Discovery.

Author information

Shahrjooihaghighi Aliasghar, Frigui Hichem, Zhang Xiang, Wei Xiaoli, Shi Biyun, Trabelsi Ameni

Affiliations

Department of Computer Engineering and Computer Science, University of Louisville, Louisville, KY 40292, USA.

Department of Chemistry, University of Louisville, Louisville, KY 40292, USA.

Publication information

Proc IEEE Int Symp Signal Proc Inf Tech. 2017 Dec;2017:416-421. doi: 10.1109/ISSPIT.2017.8388679. Epub 2018 Jun 21.

DOI:10.1109/ISSPIT.2017.8388679

PMID:30887013

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC6420823/

Abstract

Feature selection in Liquid Chromatography-Mass Spectrometry (LC-MS)-based metabolomics data (biomarker discovery) has become an important topic for machine learning researchers. The high dimensionality and small sample size of LC-MS data make feature selection a challenging task. The goal of biomarker discovery is to select the few most discriminative features from among a large number of irrelevant ones. To improve the reliability of the discovered biomarkers, we use an ensemble-based approach. Ensemble learning can improve the accuracy of feature selection by combining multiple algorithms that provide complementary information. In this paper, we propose an ensemble approach that combines the results of filter-based feature selection methods. To evaluate the proposed approach, we compare it to two commonly used methods, the t-test and PLS-DA, on a real data set.
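The abstract does not say which filters are combined or how their results are merged, so the following is only an illustrative sketch of the general idea of an ensemble of filter-based selectors. It assumes three simple two-class filters (a Welch t statistic, a signal-to-noise ratio, and the distance of the AUC from 0.5) and assumes mean-rank aggregation. All function names, the synthetic data, and the aggregation rule are hypothetical choices made for illustration, not the authors' method.

```python
import math
import random
from statistics import mean, stdev, variance


def welch_t(a, b):
    """Absolute Welch t statistic for two groups of values."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0


def signal_to_noise(a, b):
    """Absolute mean difference divided by the sum of standard deviations."""
    spread = stdev(a) + stdev(b)
    return abs(mean(a) - mean(b)) / spread if spread > 0 else 0.0


def auc_distance(a, b):
    """Distance of the two-group AUC from 0.5 (0 means no separation)."""
    wins = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    return abs(wins / (len(a) * len(b)) - 0.5)


def ranks(scores):
    """Rank 1 goes to the highest score; ties are broken by feature index."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    result = [0] * len(scores)
    for position, j in enumerate(order, start=1):
        result[j] = position
    return result


def ensemble_select(X, y, k, filters=(welch_t, signal_to_noise, auc_distance)):
    """Return the k feature indices with the best (lowest) mean rank.

    Assumption: filters are combined by averaging their per-feature ranks.
    """
    n_features = len(X[0])
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    all_ranks = []
    for score in filters:
        scores = [
            score([r[j] for r in pos], [r[j] for r in neg])
            for j in range(n_features)
        ]
        all_ranks.append(ranks(scores))
    mean_rank = [mean(r[j] for r in all_ranks) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: mean_rank[j])[:k]


# Synthetic stand-in for a small-sample, many-feature data set:
# 20 samples, 50 features, and only features 0 and 1 differ between classes.
random.seed(0)
y = [0] * 10 + [1] * 10
X = []
for label in y:
    row = [random.gauss(0.0, 1.0) for _ in range(50)]
    if label == 1:
        row[0] += 6.0
        row[1] += 6.0
    X.append(row)

selected = ensemble_select(X, y, k=2)
print(sorted(selected))
```

Averaging ranks rather than raw scores keeps filters with different scales comparable, and a feature has to rank well under several criteria to be selected, which is one plausible way an ensemble could make the chosen biomarkers more reliable than any single filter.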
