A stable iterative method for refining discriminative gene clusters.

Authors

Xu Min, Zhu Mengxia, Zhang Louxin

Affiliation

Program in Molecular and Computational Biology, University of Southern California, Los Angeles, CA, USA.

Publication

BMC Genomics. 2008 Sep 16;9 Suppl 2(Suppl 2):S18. doi: 10.1186/1471-2164-9-S2-S18.

Abstract

BACKGROUND

Microarray technology is often used to identify genes that are differentially expressed between two biological conditions. Because microarray datasets contain few samples and many genes, it is usually desirable to identify small gene subsets whose expression patterns differ between sample classes. Such gene subsets are highly discriminative in phenotype classification because their features are tightly coupled. Unfortunately, classifiers identified this way tend to generalize poorly to test samples because of overfitting.

RESULTS

We propose a novel approach that combines supervised and unsupervised learning techniques to generate increasingly discriminative gene clusters iteratively. Our experiments on both simulated and real datasets show that the method produces a series of robust gene clusters with good classification performance compared with existing approaches.
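The abstract gives no algorithmic detail, so the following is only an illustrative sketch of the general idea (alternate an unsupervised clustering step with a supervised scoring step, and eliminate backward), not the authors' method. The choice of k-means for clustering, a Welch t-statistic for scoring, and every function name and parameter (`t_scores`, `kmeans_genes`, `refine_clusters`, `k`, `rounds`) are assumptions made for illustration; the data are synthetic.

```python
import numpy as np

def t_scores(X, y):
    """Absolute Welch t-statistic per column of X for binary labels y (0/1)."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (se + 1e-12)

def kmeans_genes(G, k, rng, n_iter=20):
    """Plain k-means over gene profiles (rows of G); returns one label per gene."""
    centers = G[rng.choice(len(G), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((G[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = G[labels == j].mean(axis=0)
    return labels

def refine_clusters(X, y, k=4, rounds=3, seed=0):
    """Hypothetical refinement loop: cluster the surviving genes, score each
    cluster by how well its mean profile separates the classes, drop the worst."""
    rng = np.random.default_rng(seed)
    genes = np.arange(X.shape[1])
    history = [genes.copy()]
    for _ in range(rounds):
        if len(genes) <= k:
            break
        labels = kmeans_genes(X[:, genes].T, k, rng)   # unsupervised step
        used = np.unique(labels)
        if len(used) < 2:
            break
        # Supervised step: t-statistic of each cluster's mean expression profile.
        scores = {
            j: t_scores(X[:, genes[labels == j]].mean(axis=1)[:, None], y)[0]
            for j in used
        }
        worst = min(scores, key=scores.get)
        genes = genes[labels != worst]                 # backward elimination
        history.append(genes.copy())
    return history

# Synthetic data (an assumption): 20 samples, 60 genes, first 10 genes shifted in class 1.
rng = np.random.default_rng(1)
y = np.array([0] * 10 + [1] * 10)
X = rng.normal(size=(20, 60))
X[y == 1, :10] += 3.0
history = refine_clusters(X, y)
print([len(h) for h in history])  # number of surviving genes after each round
```

Each round removes at least one gene, so the sequence of surviving sets shrinks monotonically. Whether the informative genes survive depends on the clustering, which is why a real evaluation would score the result on held-out samples rather than on the training data.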

CONCLUSION

This backward approach for refining a series of highly discriminative gene clusters for classification proves to be consistent and stable when applied to various types of training samples.
