Interpreting experimental results using gene ontologies.

Author information

Beissbarth Tim

Affiliation

The Walter and Eliza Hall Institute of Medical Research, Bioinformatics Group, Victoria, Australia.

Publication information

Methods Enzymol. 2006;411:340-52. doi: 10.1016/S0076-6879(06)11018-6.

Abstract

High-throughput experimental techniques, such as microarrays, produce large amounts of data about gene expression levels. However, interpreting these data and turning them into biologically meaningful knowledge can be challenging. Frequently the output of such an analysis is a list of significant genes or a ranked list of genes. In DNA microarray studies, data analysis often leads to lists of hundreds of differentially expressed genes. Likewise, clustering of gene expression data may yield clusters of tens to hundreds of genes. These data are of little use unless the results can be interpreted in a biological context. The Gene Ontology Consortium provides a controlled vocabulary for annotating the biological knowledge that is known or predicted for a given gene. The Gene Ontologies (GOs) are organized as a hierarchy of annotation terms, which facilitates analysis and interpretation at different levels. The top-level ontologies are molecular function, biological process, and cellular component. Several annotation databases exist for the genes of different organisms. This chapter describes how to use GO to help interpret biologically the lists of genes resulting from high-throughput experiments. It describes some statistical methods for finding significantly over- or underrepresented GO terms within a list of genes, and describes some tools and how to use them for such an analysis. This chapter focuses primarily on the tool GOstat (http://gostat.wehi.edu.au). Other tools enable similar analyses but are not described in detail here.
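To make the idea of over- and underrepresented GO terms concrete, the following is a minimal sketch of one common approach, a one-sided hypergeometric test (equivalent to a one-sided Fisher's exact test on a 2x2 table). It is an illustration under stated assumptions, not a reproduction of GOstat's implementation: the function names, the `term_to_genes` mapping, and the toy gene and term identifiers are all hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """Probability of exactly k annotated genes in a list of n genes,
    drawn without replacement from N background genes of which K are
    annotated with the GO term."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def enrichment_p_values(k, N, K, n):
    """Return (p_over, p_under): the one-sided probabilities of seeing
    at least k, or at most k, annotated genes in the list by chance."""
    lo = max(0, n - (N - K))  # smallest possible overlap
    hi = min(K, n)            # largest possible overlap
    p_over = sum(hypergeom_pmf(i, N, K, n) for i in range(k, hi + 1))
    p_under = sum(hypergeom_pmf(i, N, K, n) for i in range(lo, k + 1))
    return p_over, p_under


def go_term_enrichment(gene_list, background, term_to_genes):
    """Test every term in term_to_genes (hypothetical mapping: term -> genes).
    Returns {term: (k, K, p_over, p_under)}."""
    background = set(background)
    selected = set(gene_list) & background  # ignore genes outside the background
    N, n = len(background), len(selected)
    results = {}
    for term, genes in term_to_genes.items():
        annotated = set(genes) & background
        K = len(annotated)
        k = len(annotated & selected)
        results[term] = (k, K) + enrichment_p_values(k, N, K, n)
    return results


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    background = [f"g{i}" for i in range(1, 11)]
    term_to_genes = {"TERM_A": ["g1", "g2", "g3", "g4", "g5"],
                     "TERM_B": ["g6", "g7"]}
    gene_list = ["g1", "g2", "g3", "g4", "g6"]
    for term, row in go_term_enrichment(gene_list, background, term_to_genes).items():
        print(term, row)
```

Because many GO terms are tested at once, the raw p-values from a sketch like this would normally be adjusted for multiple testing before any term is called significant.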

