An ensemble framework for microarray data classification based on feature subspace partitioning.

Author information

Department of Computer Engineering, Faculty of Engineering, Arak University, Arak, 38156-8-8349, Iran.

Publication information

Comput Biol Med. 2022 Sep;148:105820. doi: 10.1016/j.compbiomed.2022.105820. Epub 2022 Jul 14.

Abstract

Feature selection is vulnerable to the curse of dimensionality, and the problem is exacerbated with high-dimensional data such as microarrays. Moreover, because microarray data have few instances and many features (the low-instance/high-feature, or LIHF, property), computing and comparing feature scores to choose the best subset takes considerable processing time, which has prompted many efforts to mitigate the LIHF property of such genomic medicine data. Motivated by the promising results of ensemble models in machine learning, this paper presents a novel framework for microarray data classification, named feature-level aggregation-based ensemble based on overlapped feature subspace partitioning (FLAE-OFSP). The proposed ensemble has three main steps: first, several feature subsets are generated by the proposed partitioning approach; second, a feature selection algorithm (i.e., a feature ranker) is applied to each subset; and finally, the results are combined into a single ranked list using six defined aggregation functions. An evaluation on seven microarray datasets using four measures (stability, classification accuracy, runtime, and Modscore) shows a substantial runtime improvement and good-quality results on the other measures compared to individual methods.
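
The three steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the partitioning scheme, the ranker, or the six aggregation functions, so everything below is an assumption made for illustration. The function names (overlapped_partitions, score_feature, aggregate, ensemble_rank, toy_data), the cyclic "borrow from the next chunk" overlap rule, the mean-difference filter score, and the use of mean or max as aggregation functions are all hypothetical.

```python
import random
from statistics import mean, pstdev


def overlapped_partitions(n_features, n_parts, overlap=0.5, seed=0):
    """Step 1 (assumed scheme): shuffle feature indices, deal them into
    n_parts chunks, then let each chunk borrow a fraction of the next
    chunk, wrapping around, so neighbouring subsets overlap."""
    idx = list(range(n_features))
    random.Random(seed).shuffle(idx)
    chunks = [idx[i::n_parts] for i in range(n_parts)]
    parts = []
    for i, chunk in enumerate(chunks):
        nxt = chunks[(i + 1) % n_parts]
        borrowed = nxt[: int(len(nxt) * overlap)]
        parts.append(list(dict.fromkeys(chunk + borrowed)))  # drop duplicates
    return parts


def score_feature(column, y):
    """Assumed ranker: absolute difference of the two class means divided
    by the overall standard deviation (binary labels 0/1, both present)."""
    a = [v for v, label in zip(column, y) if label == 0]
    b = [v for v, label in zip(column, y) if label == 1]
    spread = pstdev(column) or 1.0  # guard against constant features
    return abs(mean(a) - mean(b)) / spread


def score_subset(X, y, feature_ids):
    """Step 2: apply the ranker to the features of one subset only."""
    return {f: score_feature([row[f] for row in X], y) for f in feature_ids}


def aggregate(score_dicts, combine=mean):
    """Step 3: pool the scores each feature received across subsets and
    combine them (mean, max, min, ...) into one ranked list, best first."""
    pooled = {}
    for scores in score_dicts:
        for f, s in scores.items():
            pooled.setdefault(f, []).append(s)
    return sorted(pooled, key=lambda f: combine(pooled[f]), reverse=True)


def ensemble_rank(X, y, n_parts=3, overlap=0.5, combine=mean, seed=0):
    """Run partition -> rank per subset -> aggregate."""
    parts = overlapped_partitions(len(X[0]), n_parts, overlap, seed)
    return aggregate([score_subset(X, y, p) for p in parts], combine)


def toy_data(n_features=10, informative=4, seed=1):
    """Eight samples of Gaussian noise with one planted informative feature."""
    rng = random.Random(seed)
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in y]
    for row, label in zip(X, y):
        row[informative] = 10.0 * label
    return X, y


X, y = toy_data()
ranked = ensemble_rank(X, y, n_parts=3, overlap=0.5, combine=mean)
print(ranked[:3])  # the planted feature (index 4) should come first
```

Because the filter used here scores each feature independently, partitioning does not change the scores in this toy version; it only shows the data flow. The runtime benefit reported in the abstract would presumably matter most for rankers whose cost grows faster than linearly with the number of features, since each subset is much smaller than the full feature set.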
