Biological cluster evaluation for gene function prediction.

Authors

Klie Sebastian, Nikoloski Zoran, Selbig Joachim

Affiliation

1 Max-Planck Institute for Molecular Plant Physiology, Potsdam, Brandenburg, Germany.

Publication information

J Comput Biol. 2014 Jun;21(6):428-45. doi: 10.1089/cmb.2009.0129. Epub 2010 Jan 8.

DOI:10.1089/cmb.2009.0129

PMID:20059365

Abstract

Recent advances in high-throughput omics techniques make it possible to decode the function of genes by applying the "guilt-by-association" principle to biologically meaningful clusters of gene expression data. However, the existing frameworks for biological evaluation of gene clusters are hindered by two bottlenecks: (1) the choice of the number of clusters, and (2) external measures that do not take into consideration the structure of the analyzed data and the ontology of the existing biological knowledge. Here, we address these bottlenecks by developing a novel framework that allows not only for biological evaluation of gene expression clusters based on existing structured knowledge, but also for prediction of putative gene functions. The proposed framework facilitates propagation of statistical significance at each of the following steps: (1) estimating the number of clusters, (2) evaluating the clusters in terms of novel external structural measures, (3) selecting an optimal clustering algorithm, and (4) predicting gene functions. The framework also includes a method for evaluation of gene clusters based on the structure of the employed ontology. Moreover, our method for obtaining a probabilistic range for the number of clusters is shown to be valid on synthetic data and on available gene expression profiles from Saccharomyces cerevisiae. Finally, we propose a network-based approach for gene function prediction that relies on the clustering with the optimal score and on the employed ontology. Our approach effectively predicts gene function on the Saccharomyces cerevisiae data set and is also employed to obtain putative gene functions for an Arabidopsis thaliana data set.

Similar articles

Minimum spanning trees for gene expression data clustering.

Genome Inform. 2001;12:24-33.

Multi-stage filtering for improving confidence level and determining dominant clusters in clustering algorithms of gene expression data.

Comput Biol Med. 2013 Sep;43(9):1120-33. doi: 10.1016/j.compbiomed.2013.05.011. Epub 2013 May 31.

Knowledge-assisted recognition of cluster boundaries in gene expression data.

Artif Intell Med. 2005 Sep-Oct;35(1-2):171-83. doi: 10.1016/j.artmed.2005.02.007.

Genome-scale gene function prediction using multiple sources of high-throughput data in yeast Saccharomyces cerevisiae.

OMICS. 2004 Winter;8(4):322-33. doi: 10.1089/omi.2004.8.322.

Fuzzy c-means clustering with prior biological knowledge.

J Biomed Inform. 2009 Feb;42(1):74-81. doi: 10.1016/j.jbi.2008.05.009. Epub 2008 May 24.

Quantitative inference of dynamic regulatory pathways via microarray data.

BMC Bioinformatics. 2005 Mar 7;6:44. doi: 10.1186/1471-2105-6-44.

Co-clustering and visualization of gene expression data and gene ontology terms for Saccharomyces cerevisiae using self-organizing maps.

J Biomed Inform. 2007 Apr;40(2):160-73. doi: 10.1016/j.jbi.2006.05.001. Epub 2006 May 20.

Combining multisource information through functional-annotation-based weighting: gene function prediction in yeast.

IEEE Trans Biomed Eng. 2009 Feb;56(2):229-36. doi: 10.1109/TBME.2008.2005955. Epub 2008 Sep 30.

Prediction of operon-like gene clusters in the Arabidopsis thaliana genome based on co-expression analysis of neighboring genes.

Gene. 2012 Jul 15;503(1):56-64. doi: 10.1016/j.gene.2012.04.043. Epub 2012 Apr 24.

Cited by

Conserved co-functional network between maize and Arabidopsis aid in the identification of seed defective genes in maize.

Genes Genomics. 2021 May;43(5):433-446. doi: 10.1007/s13258-021-01067-2. Epub 2021 Mar 2.

Unified feature association networks through integration of transcriptomic and proteomic data.

PLoS Comput Biol. 2019 Sep 17;15(9):e1007241. doi: 10.1371/journal.pcbi.1007241. eCollection 2019 Sep.

Prediction of key gene function in spinal muscular atrophy using guilt by association method based on network and gene ontology.

Exp Ther Med. 2019 Apr;17(4):2561-2566. doi: 10.3892/etm.2019.7216. Epub 2019 Jan 29.

Evolution of herbivore-induced early defense signaling was shaped by genome-wide duplications in .

Elife. 2016 Nov 4;5:e19531. doi: 10.7554/eLife.19531.

The Choice between MapMan and Gene Ontology for Automated Gene Function Prediction in Plant Science.

Front Genet. 2012 Jun 28;3:115. doi: 10.3389/fgene.2012.00115. eCollection 2012.

Functional genome annotation of Drosophila seminal fluid proteins using transcriptional genetic networks.

Genet Res (Camb). 2011 Dec;93(6):387-95. doi: 10.1017/S0016672311000346.

PlaNet: combined sequence and expression comparisons across plant networks derived from seven species.

Plant Cell. 2011 Mar;23(3):895-910. doi: 10.1105/tpc.111.083667. Epub 2011 Mar 25.

Speeding up the Consensus Clustering methodology for microarray data analysis.

Algorithms Mol Biol. 2011 Jan 14;6(1):1. doi: 10.1186/1748-7188-6-1.

Metabolomic and transcriptomic stress response of Escherichia coli.

Mol Syst Biol. 2010 May 11;6:364. doi: 10.1038/msb.2010.18.
