Assessing the functional coherence of gene sets with metrics based on the Gene Ontology graph.

Author information

Department of Biochemistry and Molecular Biology, Medical University of South Carolina, Charleston, SC 29425, USA.

Publication information

Bioinformatics. 2010 Jun 15;26(12):i79-87. doi: 10.1093/bioinformatics/btq203.

Abstract

MOTIVATION

The results of initial analyses for many high-throughput technologies commonly take the form of gene or protein sets, and one of the ensuing tasks is to evaluate the functional coherence of these sets. The study of gene set function most commonly makes use of controlled vocabularies in the form of ontology annotations. For a given gene set, the statistical significance of observing these annotations, or 'enrichment', may be tested using a number of methods. Instead of testing the significance of individual terms, this study is concerned with assessing the global functional coherence of gene sets, for which novel metrics and statistical methods have been devised.
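As a point of reference for the per-term 'enrichment' testing mentioned above, one widely used option is the one-sided hypergeometric test. The sketch below is only an illustration of that standard test, not a method taken from this study; the function name `enrichment_p_value` and its parameters are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """One-sided hypergeometric tail probability P(X >= hits).

    population: number of genes in the background (assumed known)
    annotated:  background genes carrying the term of interest
    sample:     size of the gene set under study
    hits:       genes in the set carrying the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability of seeing `hits` or more annotated genes by chance.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers, invented for illustration only.
p = enrichment_p_value(population=1000, annotated=40, sample=25, hits=6)
print(f"enrichment p-value: {p:.3g}")
```

A test of this kind scores one term at a time, which is the limitation the global coherence metrics of this study are meant to address.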

RESULTS

The metrics of this study are based on the topological properties of graphs composed of genes and their Gene Ontology annotations. A novel aspect of these methods is that both the enrichment of annotations and the relationships among annotations are considered when determining the significance of functional coherence. We applied our methods to an existing database and to microarray experimental results. We demonstrated that our approach is highly discriminative in differentiating coherent gene sets from random ones and that it provides biologically sensible evaluations in microarray analysis. We further used examples to show the utility of graph visualization as a tool for studying the functional coherence of gene sets.
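The general idea of scoring a gene set by the topology of a gene-and-annotation graph, and then comparing it with random sets, can be sketched as follows. The abstract does not specify the metrics, so the mean shortest-path distance between genes is used here purely as a stand-in; the toy ontology, the annotations, and all function names are hypothetical.

```python
import random
from collections import deque
from itertools import combinations

# Hypothetical toy ontology: child term -> parent terms.
PARENTS = {
    "root": [],
    "metabolism": ["root"],
    "signaling": ["root"],
    "glycolysis": ["metabolism"],
    "tca_cycle": ["metabolism"],
    "kinase_cascade": ["signaling"],
}

# Hypothetical annotations: gene -> ontology terms.
ANNOTATIONS = {
    "g1": {"glycolysis"}, "g2": {"glycolysis"}, "g3": {"tca_cycle"},
    "g4": {"kinase_cascade"}, "g5": {"signaling"}, "g6": {"kinase_cascade"},
}


def build_graph(genes):
    """Undirected graph joining genes to their terms and terms to parents."""
    adj = {term: set() for term in PARENTS}
    for term, parents in PARENTS.items():
        for parent in parents:
            adj[term].add(parent)
            adj[parent].add(term)
    for gene in genes:
        adj.setdefault(gene, set())
        for term in ANNOTATIONS[gene]:
            adj[gene].add(term)
            adj[term].add(gene)
    return adj


def distance(adj, src, dst):
    """Breadth-first search for the number of edges between two nodes."""
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in adj[node] - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    raise ValueError(f"{dst} is not reachable from {src}")


def mean_pairwise_distance(genes):
    """Stand-in coherence score: lower means the genes sit closer together."""
    adj = build_graph(genes)
    pairs = list(combinations(sorted(genes), 2))
    return sum(distance(adj, a, b) for a, b in pairs) / len(pairs)


def empirical_p_value(genes, n_random=1000, seed=0):
    """Fraction of random same-size gene sets scoring at least as well."""
    rng = random.Random(seed)
    observed = mean_pairwise_distance(genes)
    background = sorted(ANNOTATIONS)
    as_good = sum(
        mean_pairwise_distance(rng.sample(background, len(genes))) <= observed
        for _ in range(n_random)
    )
    return (as_good + 1) / (n_random + 1)


print(mean_pairwise_distance(["g1", "g2", "g3"]))
print(empirical_p_value(["g1", "g2", "g3"]))
```

Because genes are linked through the ontology's parent-child edges as well as through shared terms, two genes with related but non-identical annotations still score as closer than two genes from unrelated branches, which is the kind of relationship a per-term enrichment test ignores.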

AVAILABILITY

The implementation is provided as a freely accessible web application at: http://projects.dbbe.musc.edu/gosteiner. Additionally, the source code, written in the Python programming language, is available under the General Public License of the Free Software Foundation.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

