A stable iterative method for refining discriminative gene clusters.

Authors

Xu Min, Zhu Mengxia, Zhang Louxin

Affiliation

Program in Molecular and Computational Biology, University of Southern California, Los Angeles, CA, USA.

Publication information

BMC Genomics. 2008 Sep 16;9 Suppl 2(Suppl 2):S18. doi: 10.1186/1471-2164-9-S2-S18.

DOI: 10.1186/1471-2164-9-S2-S18

PMID: 18831783

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC2559882/

Abstract

BACKGROUND

Microarray technology is often used to identify genes that are differentially expressed between two biological conditions. On the other hand, because microarray datasets contain few samples and many genes, it is usually desirable to identify small gene subsets whose expression patterns differ between sample classes. Such gene subsets are highly discriminative in phenotype classification because their features are tightly coupled. Unfortunately, classifiers identified in this way tend to generalize poorly to test samples because of overfitting.

RESULTS

We propose a novel approach that combines supervised and unsupervised learning techniques to generate increasingly discriminative gene clusters iteratively. Experiments on both simulated and real datasets show that the method produces a series of robust gene clusters with good classification performance compared with existing approaches.

CONCLUSION

This backward approach for refining a series of highly discriminative gene clusters for classification proves consistent and stable when applied to various types of training samples.
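The abstract does not spell out the algorithm, so the following is only an illustrative sketch of how a backward loop might alternate an unsupervised step with a supervised one. It is not the authors' method. The use of k-means, the class-mean separation score, the synthetic data, and every name and parameter (`separation`, `kmeans`, `refine`, `k`, `stop_at`) are assumptions made for illustration.

```python
import random
from statistics import mean, pstdev

def separation(profile, labels):
    """Supervised score: absolute gap between class means, scaled by overall spread."""
    a = [x for x, y in zip(profile, labels) if y == 0]
    b = [x for x, y in zip(profile, labels) if y == 1]
    spread = pstdev(profile) or 1.0  # guard against a constant profile
    return abs(mean(a) - mean(b)) / spread

def kmeans(rows, k, rng, iters=20):
    """Unsupervised step: plain k-means over gene profiles; returns one cluster index per gene."""
    centers = rng.sample(rows, k)
    assign = [0] * len(rows)
    for _ in range(iters):
        assign = [min(range(k),
                      key=lambda c: sum((x - m) ** 2 for x, m in zip(r, centers[c])))
                  for r in rows]
        for c in range(k):
            members = [r for r, a in zip(rows, assign) if a == c]
            if members:  # keep the old center if a cluster emptied out
                centers[c] = [mean(col) for col in zip(*members)]
    return assign

def refine(expr, labels, k=3, stop_at=4, seed=0):
    """Backward loop: cluster surviving genes, drop the least discriminative cluster, repeat."""
    rng = random.Random(seed)
    genes = list(expr)
    history = [genes[:]]
    while len(genes) > max(stop_at, k):
        assign = kmeans([expr[g] for g in genes], k, rng)
        clusters = {}
        for g, a in zip(genes, assign):
            clusters.setdefault(a, []).append(g)
        if len(clusters) < 2:  # nothing left to compare
            break

        def score(members):
            centroid = [mean(col) for col in zip(*(expr[g] for g in members))]
            return separation(centroid, labels)

        weakest = min(clusters.values(), key=score)
        genes = [g for g in genes if g not in weakest]
        history.append(genes[:])
    return history

# Synthetic data (hypothetical): 10 samples, 6 class-shifted genes, 12 noise genes.
rng = random.Random(1)
labels = [0] * 5 + [1] * 5
expr = {}
for i in range(6):
    expr[f"sig{i}"] = [rng.gauss(0, 1) for _ in range(5)] + [rng.gauss(3, 1) for _ in range(5)]
for i in range(12):
    expr[f"noise{i}"] = [rng.gauss(0, 1) for _ in range(10)]

history = refine(expr, labels)
print([len(h) for h in history])  # size of the surviving gene set after each round
```

Each entry of `history` is the gene set that survived one more round, so the list can be read as a series of progressively smaller candidate clusters. A real evaluation would score each set on held-out samples, since selecting genes and testing on the same samples invites the overfitting the abstract warns about.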
