f-Information measures for efficient selection of discriminative genes from microarray data.

Author information

Maji Pradipta

Affiliation

Machine Intelligence Unit, Indian Statistical Institute, Kolkata 700 108, India.

Publication information

IEEE Trans Biomed Eng. 2009 Apr;56(4):1063-9. doi: 10.1109/TBME.2008.2004502. Epub 2008 Sep 16.

DOI:10.1109/TBME.2008.2004502

PMID:19272938

Abstract

Among the large number of genes present in microarray gene expression data, only a small fraction is effective for performing a given diagnostic test. In this regard, mutual information has been shown to be successful for selecting a set of relevant and nonredundant genes from microarray data. However, information theory offers many more measures, such as the f-information measures, that may be suitable for selecting genes from microarray gene expression data. This paper presents different f-information measures as evaluation criteria for the gene selection problem. To compute gene-gene redundancy (respectively, gene-class relevance), these measures calculate the divergence of the joint distribution of two genes' expression values (respectively, of a gene's expression values and the class labels of the samples) from the joint distribution obtained when the two genes (respectively, the gene and the class label) are assumed to be completely independent. The performance of the different f-information measures is compared with that of mutual information based on the predictive accuracy of the naive Bayes classifier, the K-nearest neighbor rule, and the support vector machine. An important finding is that some f-information measures are effective for selecting relevant and nonredundant genes from microarray data. The effectiveness of the different f-information measures, along with a comparison with mutual information, is demonstrated on breast cancer, leukemia, and colon cancer datasets. While some f-information measures provide 100% prediction accuracy on all three microarray datasets, mutual information attains this accuracy only on the breast cancer dataset, reaching 98.6% and 93.6% on the leukemia and colon cancer datasets, respectively.
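The divergence described in the abstract can be illustrated with a minimal sketch. It assumes expression values have already been discretized into a few levels, and it uses V-information (the sum of absolute differences between the joint distribution and the product of the marginals) as one standard member of the f-information family; the abstract does not say which specific measures or which discretization the paper uses, so the function names and the choice of measure here are illustrative assumptions, not the author's implementation. Mutual information is included for comparison.

```python
import math
from collections import Counter


def joint_and_marginals(x, y):
    """Empirical joint and marginal distributions of two discrete sequences."""
    n = len(x)
    joint = {pair: c / n for pair, c in Counter(zip(x, y)).items()}
    px = {v: c / n for v, c in Counter(x).items()}
    py = {v: c / n for v, c in Counter(y).items()}
    return joint, px, py


def v_information(x, y):
    """V-information: sum over all cells of |p(a, b) - p(a) * p(b)|."""
    joint, px, py = joint_and_marginals(x, y)
    return sum(
        abs(joint.get((a, b), 0.0) - px[a] * py[b])
        for a in px
        for b in py
    )


def mutual_information(x, y):
    """Mutual information in nats: sum of p(a, b) * log(p(a, b) / (p(a) * p(b)))."""
    joint, px, py = joint_and_marginals(x, y)
    return sum(
        p * math.log(p / (px[a] * py[b]))
        for (a, b), p in joint.items()
    )


# Hypothetical toy data: four samples, two classes, binary expression levels.
labels = [0, 0, 1, 1]
gene_a = [0, 0, 1, 1]  # tracks the class label exactly
gene_b = [0, 1, 0, 1]  # statistically independent of the class label

relevance_a = v_information(gene_a, labels)   # 1.0
relevance_b = v_information(gene_b, labels)   # 0.0
redundancy_ab = v_information(gene_a, gene_b)  # 0.0
```

The same function serves for both quantities named in the abstract: passing a gene and the class labels gives a gene-class relevance score, while passing two genes gives a gene-gene redundancy score. Both measures are zero exactly when the empirical joint distribution equals the product of its marginals.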

Similar articles

Fuzzy-rough sets for information measures and selection of relevant genes from microarray data.

IEEE Trans Syst Man Cybern B Cybern. 2010 Jun;40(3):741-52. doi: 10.1109/TSMCB.2009.2028433. Epub 2009 Nov 3.

Filter versus wrapper gene selection approaches in DNA microarray domains.

Artif Intell Med. 2004 Jun;31(2):91-103. doi: 10.1016/j.artmed.2004.01.007.

Interpretable gene expression classifier with an accurate and compact fuzzy rule base for microarray data analysis.

Biosystems. 2006 Sep;85(3):165-76. doi: 10.1016/j.biosystems.2006.01.002. Epub 2006 Feb 21.

Relevant and significant supervised gene clusters for microarray cancer classification.

IEEE Trans Nanobioscience. 2012 Jun;11(2):161-8. doi: 10.1109/TNB.2012.2193590. Epub 2012 Apr 27.

Multiclass cancer classification by support vector machines with class-wise optimized genes and probability estimates.

J Theor Biol. 2009 Aug 7;259(3):533-40. doi: 10.1016/j.jtbi.2009.04.013. Epub 2009 May 3.

Identification of differential gene expression for microarray data using recursive random forest.

Chin Med J (Engl). 2008 Dec 20;121(24):2492-6.

Selection of relevant genes in cancer diagnosis based on their prediction accuracy.

Artif Intell Med. 2007 May;40(1):29-44. doi: 10.1016/j.artmed.2006.06.002. Epub 2006 Aug 22.

Gene selection and classification from microarray data using kernel machine.

FEBS Lett. 2004 Jul 30;571(1-3):93-8. doi: 10.1016/j.febslet.2004.05.087.

Gene selection from microarray data for cancer classification--a machine learning approach.

Comput Biol Chem. 2005 Feb;29(1):37-46. doi: 10.1016/j.compbiolchem.2004.11.001.

Cited by

A review on methods for predicting miRNA-mRNA regulatory modules.

J Integr Bioinform. 2022 Apr 1;19(3). doi: 10.1515/jib-2020-0048. eCollection 2022 Sep 1.

Incorporating EBO-HSIC with SVM for Gene Selection Associated with Cervical Cancer Classification.

J Med Syst. 2018 Oct 6;42(11):225. doi: 10.1007/s10916-018-1092-5.

Rough sets for in silico identification of differentially expressed miRNAs.

Int J Nanomedicine. 2013;8 Suppl 1(Suppl 1):63-74. doi: 10.2147/IJN.S40739. Epub 2013 Sep 16.
