Combining gene annotations and gene expression data in model-based clustering: weighted method.

Author information

Huang Desheng, Wei Peng, Pan Wei

Affiliation

Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, 55455, USA.

Publication information

OMICS. 2006 Spring;10(1):28-39. doi: 10.1089/omi.2006.10.28.

DOI:10.1089/omi.2006.10.28

PMID:16584316

Abstract

It has been increasingly recognized that incorporating prior knowledge into cluster analysis can result in more reliable and meaningful clusters. In contrast to standard model-based clustering with a global mixture model, which does not use any prior information, a stratified mixture model was recently proposed to incorporate gene functions or biological pathways as priors in model-based clustering of gene expression profiles: various gene functional groups form the strata in a stratified mixture model. Although useful, the stratified method may be less efficient than the global analysis if the strata are non-informative for clustering. We propose a weighted method that aims to strike a balance between a stratified analysis and a global analysis: it weights the clustering results of the stratified analysis against those of the global analysis, with the weight determined by the data. More generally, the weighted method can take advantage of the hierarchical structure of most existing gene functional annotation systems, such as MIPS and Gene Ontology (GO), and facilitate choosing appropriate gene functional groups as priors. We use simulated data and real data to demonstrate the feasibility and advantages of the proposed method.
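
The weighting idea in the abstract can be illustrated with a small sketch. This is not the authors' estimator: the abstract does not give the formula, and in the paper the weight is determined by the data. Here the weight w is simply supplied by the caller, and the function names (blend_posteriors, hard_assign) and the toy probabilities are hypothetical. The sketch assumes each analysis returns, for every gene, a vector of cluster membership probabilities over the same set of clusters, and forms a convex combination of the two.

```python
from typing import List

Matrix = List[List[float]]


def blend_posteriors(global_post: Matrix, stratified_post: Matrix, w: float) -> Matrix:
    """Convex combination of two sets of cluster membership probabilities.

    w = 0 reproduces the global analysis, w = 1 the stratified analysis.
    ASSUMPTION: w is passed in by hand; the paper determines it from the data.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    blended = []
    for g_row, s_row in zip(global_post, stratified_post):
        # Weighted average, cluster by cluster, for one gene.
        blended.append([w * s + (1.0 - w) * g for g, s in zip(g_row, s_row)])
    return blended


def hard_assign(post: Matrix) -> List[int]:
    """Assign each gene to the cluster with the largest probability."""
    return [max(range(len(row)), key=row.__getitem__) for row in post]


# Toy membership probabilities for three genes and two clusters (made up).
global_post = [[0.6, 0.4], [0.5, 0.5], [0.2, 0.8]]
stratified_post = [[0.9, 0.1], [0.3, 0.7], [0.4, 0.6]]

blended = blend_posteriors(global_post, stratified_post, w=0.5)
labels = hard_assign(blended)
```

Because each input row sums to one, every blended row also sums to one, so the result can still be read as membership probabilities. Setting w close to 0 falls back on the global analysis when the functional groups carry little information about the clusters.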

Similar articles

Combining gene annotations and gene expression data in model-based clustering: weighted method.

OMICS. 2006 Spring;10(1):28-39. doi: 10.1089/omi.2006.10.28.

Incorporating gene functions as priors in model-based clustering of microarray gene expression data.

Bioinformatics. 2006 Apr 1;22(7):795-801. doi: 10.1093/bioinformatics/btl011. Epub 2006 Jan 24.

Knowledge-assisted recognition of cluster boundaries in gene expression data.

Artif Intell Med. 2005 Sep-Oct;35(1-2):171-83. doi: 10.1016/j.artmed.2005.02.007.

caBIG VISDA: modeling, visualization, and discovery for cluster analysis of genomic data.

BMC Bioinformatics. 2008 Sep 18;9:383. doi: 10.1186/1471-2105-9-383.

Evaluation and comparison of gene clustering methods in microarray analysis.

Bioinformatics. 2006 Oct 1;22(19):2405-12. doi: 10.1093/bioinformatics/btl406. Epub 2006 Jul 31.

Beyond synexpression relationships: local clustering of time-shifted and inverted gene expression profiles identifies new, biologically relevant interactions.

J Mol Biol. 2001 Dec 14;314(5):1053-66. doi: 10.1006/jmbi.2000.5219.

Incorporating gene ontology into fuzzy relational clustering of microarray gene expression data.

Biosystems. 2018 Jan;163:1-10. doi: 10.1016/j.biosystems.2017.09.017. Epub 2017 Nov 4.

Combining multisource information through functional-annotation-based weighting: gene function prediction in yeast.

IEEE Trans Biomed Eng. 2009 Feb;56(2):229-36. doi: 10.1109/TBME.2008.2005955. Epub 2008 Sep 30.

A novel approach for discovering overlapping clusters in gene expression data.

IEEE Trans Biomed Eng. 2009 Jul;56(7):1803-9. doi: 10.1109/TBME.2009.2015055. Epub 2009 Feb 20.

Incorporating biological knowledge into distance-based clustering analysis of microarray gene expression data.

Bioinformatics. 2006 May 15;22(10):1259-68. doi: 10.1093/bioinformatics/btl065. Epub 2006 Feb 24.

Cited by

Application of genetic/genomic approaches to allergic disorders.

J Allergy Clin Immunol. 2010 Sep;126(3):425-36; quiz 437-8. doi: 10.1016/j.jaci.2010.05.025. Epub 2010 Jul 16.

CLEAN: CLustering Enrichment ANalysis.

BMC Bioinformatics. 2009 Jul 29;10:234. doi: 10.1186/1471-2105-10-234.

Biomedical ontologies in action: role in knowledge management, data integration and decision support.

Yearb Med Inform. 2008:67-79.

Fuzzy c-means clustering with prior biological knowledge.

J Biomed Inform. 2009 Feb;42(1):74-81. doi: 10.1016/j.jbi.2008.05.009. Epub 2008 May 24.

Microarray data mining using landmark gene-guided clustering.

BMC Bioinformatics. 2008 Feb 11;9:92. doi: 10.1186/1471-2105-9-92.

Inferring biological functions and associated transcriptional regulators using gene set expression coherence analysis.

BMC Bioinformatics. 2007 Nov 17;8:453. doi: 10.1186/1471-2105-8-453.
