Relevant and significant supervised gene clusters for microarray cancer classification.

Affiliation

Machine Intelligence Unit, Indian Statistical Institute, 203 B. T. Road, Kolkata, 700 108, India.

Publication information

IEEE Trans Nanobioscience. 2012 Jun;11(2):161-8. doi: 10.1109/TNB.2012.2193590. Epub 2012 Apr 27.

Abstract

An important application of microarray data in functional genomics is to classify samples according to their gene expression profiles, for example to distinguish cancer from normal samples or to distinguish different types or subtypes of cancer. One of the major tasks with gene expression data is to find groups of co-regulated genes whose collective expression is strongly associated with sample categories. In this regard, a gene clustering algorithm is proposed to group genes from microarray data. It directly incorporates the information of sample categories in the grouping process to find groups of co-regulated genes with strong association to the sample categories, which makes it a supervised gene clustering algorithm. The average expression of the genes in each cluster acts as that cluster's representative. Some significant representatives are then selected to form the reduced feature set used to build classifiers for cancer classification. Mutual information is used to compute both gene-gene redundancy and gene-class relevance. The performance of the proposed method, along with a comparison with existing methods, is studied on six cancer microarray data sets using the predictive accuracy of the naive Bayes classifier, the K-nearest neighbor rule, and the support vector machine. An important finding is that the proposed algorithm is effective for identifying biologically significant gene clusters with excellent predictive capability.
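
The three quantities named in the abstract (gene-class relevance, gene-gene redundancy, and a cluster representative formed by averaging) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not state how expression values are discretized or how mutual information is estimated, so the equal-width binning in `discretize`, the plug-in estimator in `mutual_information`, and the toy expression values below are all assumptions made for illustration.

```python
import numpy as np

def mutual_information(x, y):
    """Plug-in estimate of mutual information (in bits) between two discrete 1-D arrays."""
    _, xi = np.unique(np.ravel(x), return_inverse=True)
    _, yi = np.unique(np.ravel(y), return_inverse=True)
    # Joint frequency table of the two discrete variables.
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1)
    joint /= joint.sum()
    # Product of the marginals, same shape as the joint table.
    expected = joint.sum(axis=1, keepdims=True) @ joint.sum(axis=0, keepdims=True)
    nonzero = joint > 0
    return float(np.sum(joint[nonzero] * np.log2(joint[nonzero] / expected[nonzero])))

def discretize(values, bins=2):
    """Equal-width binning (an assumption; the abstract does not specify a scheme)."""
    values = np.asarray(values, dtype=float)
    edges = np.linspace(values.min(), values.max(), bins + 1)[1:-1]
    return np.digitize(values, edges)

# Hypothetical toy data: six samples, three normal (0) and three cancer (1).
labels = np.array([0, 0, 0, 1, 1, 1])
gene_a = np.array([1.0, 1.2, 0.9, 3.0, 3.1, 2.9])  # tracks the class label
gene_b = np.array([1.1, 1.0, 1.2, 2.8, 3.2, 3.0])  # co-regulated with gene_a
gene_c = np.array([2.0, 0.5, 3.0, 0.6, 2.9, 2.1])  # unrelated to the class label

# Gene-class relevance: mutual information between a gene and the sample categories.
relevance_a = mutual_information(discretize(gene_a), labels)
relevance_c = mutual_information(discretize(gene_c), labels)

# Gene-gene redundancy: mutual information between two genes.
redundancy_ab = mutual_information(discretize(gene_a), discretize(gene_b))

# Cluster representative: average expression of the genes placed in one cluster.
representative = np.mean([gene_a, gene_b], axis=0)
relevance_rep = mutual_information(discretize(representative), labels)

print(relevance_a, relevance_c, redundancy_ab, relevance_rep)
```

In this toy setting `gene_a` and `gene_b` are both relevant to the class and highly redundant with each other, so averaging them into one representative keeps the class information while reducing two features to one; `gene_c` carries no class information under this binning.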
