Genetic algorithms applied to multi-class prediction for the analysis of gene expression data.

Authors

Ooi C H, Tan Patrick

Affiliation

Nanyang Technological University, School of Mechanical and Production Engineering, 50 Nanyang Avenue, Singapore 639798, Republic of Singapore.

Publication

Bioinformatics. 2003 Jan;19(1):37-44. doi: 10.1093/bioinformatics/19.1.37.

Abstract

MOTIVATION

An important challenge in the use of large-scale gene expression data for biological classification occurs when the expression dataset being analyzed involves multiple classes. Key issues that need to be addressed under such circumstances are the efficient selection of good predictive gene groups from datasets that are inherently 'noisy', and the development of new methodologies that can enhance the successful classification of these complex datasets.

METHODS

We have applied genetic algorithms (GAs) to the problem of multi-class prediction. A GA-based gene selection scheme is described that automatically determines the members of a predictive gene group, as well as the optimal group size, that maximizes classification success using a maximum likelihood (MLHD) classification method.
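The abstract does not give the chromosome encoding, the genetic operators, or the exact form of the MLHD classifier, so the following Python sketch only illustrates the general idea and is not the authors' implementation. It assumes a chromosome is a variable-size set of gene indices, that fitness is leave-one-out accuracy on the training samples, and that the classifier is a Gaussian maximum-likelihood rule with equal priors and a pooled per-gene variance (a simplification). All function names, parameters, and the toy data are hypothetical.

```python
import random


def mlhd_predict(train_X, train_y, x, genes):
    """Pick the class whose Gaussian log-likelihood for x is highest,
    using only the selected genes and a pooled per-gene variance (assumption)."""
    classes = sorted(set(train_y))
    means = {}
    for c in classes:
        rows = [r for r, lab in zip(train_X, train_y) if lab == c]
        means[c] = [sum(r[g] for r in rows) / len(rows) for g in genes]
    dof = max(len(train_X) - len(classes), 1)
    var = []
    for j, g in enumerate(genes):
        ss = sum((r[g] - means[lab][j]) ** 2 for r, lab in zip(train_X, train_y))
        var.append(ss / dof + 1e-9)  # small constant avoids division by zero

    def loglik(c):
        # Terms shared by all classes are dropped; only the distance differs.
        return -sum((x[g] - means[c][j]) ** 2 / var[j] for j, g in enumerate(genes))

    return max(classes, key=loglik)


def fitness(X, y, genes):
    """Leave-one-out classification accuracy of a gene subset (assumed fitness)."""
    correct = 0
    for i in range(len(X)):
        train_X, train_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        correct += mlhd_predict(train_X, train_y, X[i], genes) == y[i]
    return correct / len(X)


def run_ga(X, y, n_genes, min_size=1, max_size=5, pop_size=20, generations=15, seed=0):
    """Evolve gene subsets; subset size is free to vary between min_size and max_size."""
    rng = random.Random(seed)
    cache = {}

    def score(genes):
        if genes not in cache:
            cache[genes] = fitness(X, y, genes)
        return cache[genes]

    def random_subset():
        return tuple(sorted(rng.sample(range(n_genes), rng.randint(min_size, max_size))))

    def crossover(a, b):
        # Child draws its genes from the union of both parents.
        pool = sorted(set(a) | set(b))
        size = min(len(pool), rng.randint(min_size, max_size))
        return tuple(sorted(rng.sample(pool, size)))

    def mutate(genes):
        # Either try to add a random gene or drop one, so group size can change.
        genes = set(genes)
        if rng.random() < 0.5 and len(genes) < max_size:
            genes.add(rng.randrange(n_genes))  # no-op if the gene is already present
        elif len(genes) > min_size:
            genes.remove(rng.choice(sorted(genes)))
        return tuple(sorted(genes))

    def rank(g):
        # Prefer higher accuracy, then smaller groups.
        return (-score(g), len(g))

    pop = [random_subset() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=rank)
        elite = pop[: pop_size // 2]  # the better half survives unchanged
        children = [mutate(crossover(rng.choice(elite), rng.choice(elite)))
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    best = min(pop, key=rank)
    return best, score(best)


def make_toy_data(seed=1):
    """Synthetic 3-class data: genes 0 and 1 carry signal, genes 2-7 are noise."""
    rng = random.Random(seed)
    centers = {0: (0.0, 0.0), 1: (5.0, 0.0), 2: (0.0, 5.0)}
    X, y = [], []
    for label, (c0, c1) in centers.items():
        for _ in range(6):
            noise = [rng.gauss(0.0, 1.0) for _ in range(6)]
            X.append([rng.gauss(c0, 0.5), rng.gauss(c1, 0.5)] + noise)
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    best_genes, accuracy = run_ga(X, y, n_genes=8)
    print("selected genes:", best_genes, "leave-one-out accuracy:", accuracy)
```

Ranking candidates by accuracy first and group size second is one simple way to let the search settle on a group size rather than fixing it in advance; the paper may handle group size differently.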

RESULTS

The GA/MLHD-based approach achieves higher classification accuracies than other published predictive methods on the same multi-class test dataset. It also permits substantial feature reduction in classifier genesets without compromising predictive accuracy. We propose that GA-based algorithms may represent a powerful new tool in the analysis and exploration of complex multi-class gene expression data.

AVAILABILITY

Supplementary information, data sets and source code are available at http://www.omniarray.com/bioinformatics/GA.
