Alterovitz Gil, Xiang Michael, Mohan Mamta, Ramoni Marco F
Division of Health Sciences and Technology, Harvard Medical School and Massachusetts Institute of Technology, Boston, MA, USA.
Nucleic Acids Res. 2007 Jan;35(Database issue):D322-7. doi: 10.1093/nar/gkl799. Epub 2006 Nov 10.
Gene Ontology (GO) has been widely used to infer the functional significance of gene sets in order to automate discovery in large-scale genetic studies. A level in GO's directed acyclic graph structure is often assumed to indicate the specificity of its terms, although other work has suggested that this assumption does not hold. Unfortunately, quantitative analysis of biological functions based on nodes at the same level (as is common in gene enrichment analysis tools) can lead to incorrect conclusions, as well as missed discoveries due to inefficient use of available information. This paper addresses these problems using an information theoretic approach, encoded in the GO Partition Database, that is guaranteed to maximize information for gene enrichment analysis. The GO Partition Database was designed to feature ontology partitions whose GO terms have similar specificity. The GO partitions comprise varying numbers of nodes and present relevant information theoretic statistics, so researchers can choose to analyze datasets at arbitrary levels of specificity. The GO Partition Database, featuring GO partition sets for functional analysis of genes from human and 10 other commonly studied organisms with a total of 131,972 genes, is available on the internet at: bcl.med.harvard.edu/proj/gopart. The site also includes an online tutorial.
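The contrast between same-level terms and terms of similar specificity can be illustrated with a small sketch. This is not the algorithm behind the GO Partition Database, which the abstract does not describe. It only shows the general idea under stated assumptions: the term names and gene counts are invented toy data, specificity is measured as self-information (-log2 of the fraction of genes a term annotates), and a partition is scored by the Shannon entropy of how genes are spread across its terms.

```python
import math

# Hypothetical toy data: genes annotated to each term, descendants included.
# Term names are invented for illustration and are not real GO identifiers.
gene_counts = {
    "root": 16,
    "A": 12,   # depth 1, broad term
    "B": 4,    # depth 1, narrow term
    "A1": 4,   # depth 2, child of A
    "A2": 4,   # depth 2, child of A
    "A3": 4,   # depth 2, child of A
}


def information_content(term, counts, total):
    """Self-information of a term in bits: -log2 of the fraction of genes it annotates."""
    return -math.log2(counts[term] / total)


def partition_entropy(terms, counts):
    """Shannon entropy (bits) of the gene distribution across a partition's terms.

    Assumes the terms are disjoint and together cover the genes of interest.
    """
    total = sum(counts[t] for t in terms)
    return -sum(
        (counts[t] / total) * math.log2(counts[t] / total) for t in terms
    )


total_genes = gene_counts["root"]

# Partition taken from a single depth: the two terms differ widely in specificity.
same_level = ["A", "B"]
# Partition mixing depths: every term annotates the same share of genes.
mixed_level = ["A1", "A2", "A3", "B"]

for name, partition in [("same level", same_level), ("mixed level", mixed_level)]:
    ics = [round(information_content(t, gene_counts, total_genes), 3) for t in partition]
    print(name, "term information (bits):", ics,
          "partition entropy (bits):", round(partition_entropy(partition, gene_counts), 3))
```

In this toy case the same-level partition pairs a broad term with a narrow one, while the mixed-level partition consists of terms with equal self-information and has the higher entropy. That is the sense in which choosing terms by specificity rather than by depth uses the available information more efficiently.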