Department of Clinical Sciences, U.T. Southwestern Medical Center, 5323 Harry Hines Boulevard Dallas, TX 75390-9072, USA.
Bioinformatics. 2010 Apr 1;26(7):905-11. doi: 10.1093/bioinformatics/btq059. Epub 2010 Feb 21.
MOTIVATION: A typical approach for the interpretation of high-throughput experiments, such as gene expression microarrays, is to produce groups of genes based on certain criteria (e.g. genes that are differentially expressed). To gain more mechanistic insights into the underlying biology, overrepresentation analysis (ORA) is often conducted to investigate whether gene sets associated with particular biological functions, for example, as represented by Gene Ontology (GO) annotations, are statistically overrepresented in the identified gene groups. However, the standard ORA, which is based on the hypergeometric test, analyzes each GO term in isolation and does not take into account the dependence structure of the GO-term hierarchy.
RESULTS: We have developed a Bayesian approach (GO-Bayes) to measure overrepresentation of GO terms that incorporates the GO dependence structure by taking into account evidence not only from individual GO terms, but also from their related terms (i.e. parents, children, siblings, etc.). The Bayesian framework borrows information across related GO terms to strengthen the detection of overrepresentation signals. As a result, this method tends to identify sets of closely related GO terms rather than individual isolated GO terms. The advantage of the GO-Bayes approach is demonstrated with a simulation study and an application example.
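For readers unfamiliar with the baseline that the abstract contrasts against, the following is a minimal sketch of the standard hypergeometric ORA for a single GO term. It is not the GO-Bayes method, whose model is not specified in this abstract. The function name `ora_p_value`, its parameter names, and the toy counts are all assumptions chosen for illustration.

```python
from math import comb


def ora_p_value(n_background, n_annotated, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value for one GO term (illustrative only).

    n_background: genes in the background (e.g. all genes on the array)
    n_annotated:  background genes annotated to the GO term
    n_selected:   genes in the identified group (e.g. differentially expressed)
    n_overlap:    selected genes that are annotated to the GO term

    Returns P(X >= n_overlap) when n_selected genes are drawn without
    replacement from the background.
    """
    total = comb(n_background, n_selected)
    largest = min(n_annotated, n_selected)
    favourable = sum(
        comb(n_annotated, i) * comb(n_background - n_annotated, n_selected - i)
        for i in range(n_overlap, largest + 1)
    )
    return favourable / total


# Hypothetical toy counts: 20 background genes, 5 annotated to the term,
# 4 genes selected, 2 of which carry the annotation.
p = ora_p_value(20, 5, 4, 2)
print(f"{p:.4f}")  # 0.2487
```

Because each term is scored with its own call, nothing in this calculation uses the parent, child, or sibling terms. That independence is the limitation the abstract says GO-Bayes is designed to address.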