A combinational feature selection and ensemble neural network method for classification of gene expression data.

Authors

Liu Bing, Cui Qinghua, Jiang Tianzi, Ma Songde

Affiliation

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100080, P. R. China.

Publication information

BMC Bioinformatics. 2004 Sep 27;5:136. doi: 10.1186/1471-2105-5-136.

DOI:10.1186/1471-2105-5-136

PMID:15450124

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC522806/

Abstract

BACKGROUND

Microarray experiments are becoming a powerful tool for clinical diagnosis because they can reveal gene expression patterns that are characteristic of a particular disease. To date, this problem has received the most attention in cancer research, especially in tumor classification. Various feature selection methods and classifier design strategies have been widely used and compared. However, most published articles on tumor classification apply a single technique to a single dataset, and only recently have several researchers compared these techniques on several public datasets. It has been verified that differently selected features reflect different aspects of a dataset, and that some selected features yield better solutions on certain problems. At the same time, given large amounts of microarray data and little prior knowledge, it is difficult to find the intrinsic characteristics of the data using traditional methods. In this paper, we introduce a combinational feature selection method, used in conjunction with ensemble neural networks, to improve the accuracy and robustness of sample classification.
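The general idea of combining several feature selectors and voting over an ensemble of networks can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the selection criteria or the network architecture, so the two scoring functions, the single logistic unit standing in for each member network, the majority vote, and the toy data are all hypothetical choices, not the authors' method.

```python
import math

def mean_gap_score(values, labels):
    """Gap between class means divided by the summed class standard deviations."""
    a = [v for v, t in zip(values, labels) if t == 1]
    b = [v for v, t in zip(values, labels) if t == 0]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    sd_a = math.sqrt(sum((v - mean_a) ** 2 for v in a) / len(a))
    sd_b = math.sqrt(sum((v - mean_b) ** 2 for v in b) / len(b))
    return abs(mean_a - mean_b) / (sd_a + sd_b + 1e-9)

def rank_gap_score(values, labels):
    """Distance of the class-1 mean rank from the overall mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    rank = {sample: r for r, sample in enumerate(order)}
    pos = [rank[i] for i, t in enumerate(labels) if t == 1]
    return abs(sum(pos) / len(pos) - (len(values) - 1) / 2)

def top_genes(X, y, scorer, k):
    """Indices of the k highest-scoring genes (columns of X)."""
    scores = [scorer([row[g] for row in X], y) for g in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda g: -scores[g])[:k]

def train_unit(X, y, genes, epochs=200, lr=0.1):
    """Train one logistic unit (a minimal stand-in network) with plain SGD."""
    w, b = [0.0] * len(genes), 0.0
    for _ in range(epochs):
        for row, target in zip(X, y):
            z = b + sum(wi * row[g] for wi, g in zip(w, genes))
            z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
            err = 1.0 / (1.0 + math.exp(-z)) - target
            w = [wi - lr * err * row[g] for wi, g in zip(w, genes)]
            b -= lr * err
    return genes, w, b

def unit_predict(model, row):
    genes, w, b = model
    return 1 if b + sum(wi * row[g] for wi, g in zip(w, genes)) > 0 else 0

def ensemble_predict(models, row):
    """Majority vote over the member networks."""
    votes = sum(unit_predict(m, row) for m in models)
    return 1 if 2 * votes > len(models) else 0

# Hypothetical toy data: 12 samples x 4 genes; genes 0 and 2 track the label.
y = [i % 2 for i in range(12)]
X = [[2.0 * t + 0.1 * (i % 3), 0.5 * ((i * 7) % 5) - 1.0,
      -1.5 * t + 0.05 * (i % 4), 0.3 * ((i * 3) % 4)] for i, t in enumerate(y)]
train_X, train_y, test_X, test_y = X[:8], y[:8], X[8:], y[8:]

# Each selector proposes its own gene subset; a third member sees their union.
subset_a = top_genes(train_X, train_y, mean_gap_score, 1)
subset_b = top_genes(train_X, train_y, rank_gap_score, 1)
combined = sorted(set(subset_a) | set(subset_b))
models = [train_unit(train_X, train_y, s) for s in (subset_a, subset_b, combined)]
predictions = [ensemble_predict(models, row) for row in test_X]
```

Giving each member a subset from a different selector is one way to let the ensemble see the "different aspects" of the data that the text mentions; a real study would use full neural networks and far more genes than this toy setup.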

RESULTS

We validate our new method on several recent publicly available datasets, both by predictive accuracy on test samples and by cross-validation. Compared with the best performance of other current methods, our strategy obtains remarkably improved results on a wide range of datasets.
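Cross-validation, the second evaluation mode mentioned above, can be illustrated with a minimal sketch. The abstract does not state the number of folds or the exact protocol, so the interleaved fold split, the nearest-centroid stand-in classifier, and the toy data below are assumptions made purely for illustration.

```python
def k_fold_indices(n_samples, k):
    """Split sample indices 0..n-1 into k interleaved folds."""
    return [list(range(fold, n_samples, k)) for fold in range(k)]

def fit_centroids(X, y):
    """Stand-in classifier: the mean expression profile of each class."""
    centroids = {}
    for label in set(y):
        rows = [row for row, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict_centroid(centroids, row):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def cross_val_accuracy(X, y, k):
    """Fraction of samples classified correctly when each fold is held out once."""
    correct = 0
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        model = fit_centroids(train_X, train_y)  # fit on the training folds only
        correct += sum(predict_centroid(model, X[i]) == y[i] for i in test_idx)
    return correct / len(X)

# Hypothetical toy data: 6 samples x 2 genes; gene 0 separates the classes.
X = [[0.1, 1.0], [2.0, 0.9], [0.2, 1.1], [2.1, 1.0], [0.0, 0.8], [1.9, 1.2]]
y = [0, 1, 0, 1, 0, 1]

three_fold = cross_val_accuracy(X, y, 3)
leave_one_out = cross_val_accuracy(X, y, len(X))
```

Setting `k` to the number of samples gives leave-one-out cross-validation. When feature selection is part of the pipeline, it is generally safer to repeat it inside each training fold so that held-out samples do not influence which genes are chosen.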

CONCLUSIONS

We conclude that our method can extract more information from microarray data to achieve more accurate classification, and that it can also help to identify latent marker genes of diseases for better diagnosis and treatment.

Figure (from the original article): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94ca/522806/cb184bfcc337/1471-2105-5-136-1.jpg

Similar articles

A Ranking Approach for Probe Selection and Classification of Microarray Data with Artificial Neural Networks.

J Comput Biol. 2015 Oct;22(10):953-61. doi: 10.1089/cmb.2013.0125.

Support vector machine classification and validation of cancer tissue samples using microarray expression data.

Bioinformatics. 2000 Oct;16(10):906-14. doi: 10.1093/bioinformatics/16.10.906.

Iterative class discovery and feature selection using Minimal Spanning Trees.

BMC Bioinformatics. 2004 Sep 8;5:126. doi: 10.1186/1471-2105-5-126.

Hybrid genetic algorithm-neural network: feature extraction for unpreprocessed microarray data.

Artif Intell Med. 2011 Sep;53(1):47-56. doi: 10.1016/j.artmed.2011.06.008. Epub 2011 Jul 19.

A strategy for oligonucleotide microarray probe reduction.

Genome Biol. 2002;3(12):RESEARCH0073. doi: 10.1186/gb-2002-3-12-research0073. Epub 2002 Nov 25.

Expression profiling targeting chromosomes for tumor classification and prediction of clinical behavior.

Genes Chromosomes Cancer. 2003 Nov;38(3):207-14. doi: 10.1002/gcc.10276.

Bayesian automatic relevance determination algorithms for classifying gene expression data.

Bioinformatics. 2002 Oct;18(10):1332-9. doi: 10.1093/bioinformatics/18.10.1332.

Unsupervised clustering in mRNA expression profiles.

Comput Biol Med. 2006 Oct;36(10):1126-42. doi: 10.1016/j.compbiomed.2005.09.003. Epub 2005 Oct 24.

Accurate molecular classification of cancer using simple rules.

BMC Med Genomics. 2009 Oct 30;2:64. doi: 10.1186/1755-8794-2-64.

Cited by

Hybrid classical and quantum computing for enhanced glioma tumor classification using TCGA data.

Sci Rep. 2025 Jul 17;15(1):25935. doi: 10.1038/s41598-025-97067-3.

Network-based multi-class classifier to identify optimized gene networks for acute leukemia cell line classification.

PLoS One. 2025 May 8;20(5):e0321549. doi: 10.1371/journal.pone.0321549. eCollection 2025.

SVM-DO: identification of tumor-discriminating mRNA signatures via support vector machines supported by Disease Ontology.

Turk J Biol. 2023 Dec 14;47(6):349-365. doi: 10.55730/1300-0152.2670. eCollection 2023.

Evaluation of penalized and machine learning methods for asthma disease prediction in the Korean Genome and Epidemiology Study (KoGES).

BMC Bioinformatics. 2024 Feb 2;25(1):56. doi: 10.1186/s12859-024-05677-x.

Genetic Risk Assessment of Nonsyndromic Cleft Lip with or without Cleft Palate by Linking Genetic Networks and Deep Learning Models.

Int J Mol Sci. 2023 Feb 25;24(5):4557. doi: 10.3390/ijms24054557.

The Prediction of Peritoneal Carcinomatosis in Patients with Colorectal Cancer Using Machine Learning.

Healthcare (Basel). 2022 Jul 29;10(8):1425. doi: 10.3390/healthcare10081425.

Status quo and future prospects of artificial neural network from the perspective of gastroenterologists.

World J Gastroenterol. 2021 Jun 7;27(21):2681-2709. doi: 10.3748/wjg.v27.i21.2681.

Gene expression feature selection for prostate cancer diagnosis using a two-phase heuristic-deterministic search strategy.

IET Syst Biol. 2018 Aug;12(4):162-169. doi: 10.1049/iet-syb.2017.0044.

Integrative machine learning analysis of multiple gene expression profiles in cervical cancer.

PeerJ. 2018 Jul 25;6:e5285. doi: 10.7717/peerj.5285. eCollection 2018.

Clinical application of modified bag-of-features coupled with hybrid neural-based classifier in dengue fever classification using gene expression data.

Med Biol Eng Comput. 2018 Apr;56(4):709-720. doi: 10.1007/s11517-017-1722-y. Epub 2017 Sep 11.

References

Global analysis of cell type-specific gene expression.

Comp Funct Genomics. 2003;4(2):208-15. doi: 10.1002/cfg.281.

Ensemble machine learning on gene expression data for cancer classification.

Appl Bioinformatics. 2003;2(3 Suppl):S75-83.

A comparative study on feature selection and classification methods using gene expression profiles and proteomic patterns.

Genome Inform. 2002;13:51-60.

Discovery of significant rules for classifying cancer diagnosis data.

Bioinformatics. 2003 Oct;19 Suppl 2:ii93-102. doi: 10.1093/bioinformatics/btg1066.

Shrinkage-based similarity metric for cluster analysis of microarray data.

Proc Natl Acad Sci U S A. 2003 Aug 19;100(17):9668-73. doi: 10.1073/pnas.1633770100. Epub 2003 Aug 5.

Boosting for tumor classification with gene expression data.

Bioinformatics. 2003 Jun 12;19(9):1061-9. doi: 10.1093/bioinformatics/btf867.

Global stage-specific gene regulation during the developmental cycle of Chlamydia trachomatis.

J Bacteriol. 2003 May;185(10):3179-89. doi: 10.1128/JB.185.10.3179-3189.2003.

Neural network analysis of lymphoma microarray data: prognosis and diagnosis near-perfect.

BMC Bioinformatics. 2003 Apr 10;4:13. doi: 10.1186/1471-2105-4-13.

Improved gene selection for classification of microarrays.

Pac Symp Biocomput. 2003:53-64. doi: 10.1142/9789812776303_0006.

Translation of microarray data into clinically relevant cancer diagnostic tests using gene expression ratios in lung cancer and mesothelioma.

Cancer Res. 2002 Sep 1;62(17):4963-7.
