DataBionics Research Group, University of Marburg, Marburg, Germany.
Institute of Clinical Pharmacology, Goethe-University, Frankfurt am Main, Germany; Fraunhofer Institute for Molecular Biology and Applied Ecology IME, Project Group Translational Medicine and Pharmacology TMP, Frankfurt am Main, Germany.
PLoS One. 2014 Feb 25;9(2):e90191. doi: 10.1371/journal.pone.0090191. eCollection 2014.
Computational analyses of the functions of gene sets obtained from microarray analyses or topical database searches are increasingly important in biology. To understand their functions, the sets are usually mapped to Gene Ontology (GO) knowledge bases by means of over-representation analysis (ORA). The result represents the specific knowledge about the functionality of the gene set. However, this specific ontology typically consists of many terms and relationships, which hinders understanding of the 'main story'. We developed a methodology to identify a comprehensibly small number of GO terms as "headlines" of the specific ontology, allowing all central aspects of the roles of the involved genes to be understood. The Functional Abstraction method finds a set of headlines that is specific enough to cover all details of a specific ontology and abstract enough for human comprehension. This method exceeds the classical approaches to ORA abstraction, and by focusing on information rather than on decorrelation of GO terms, it directly targets human comprehension. Functional Abstraction provides, with a maximum of certainty, information value, coverage and conciseness, a representation of the biological functions in which a gene set plays a role. This is the necessary means to interpret complex Gene Ontology results, thus strengthening the role of functional genomics in biomarker and drug discovery.
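The abstract does not say which statistical test underlies the ORA step. As a minimal sketch, assuming the commonly used one-sided hypergeometric test, the over-representation of a single GO term in a gene set could be scored as follows. The function name `ora_p_value` and all counts in the usage example are hypothetical and chosen only for illustration; this is not the authors' implementation, and it does not cover the Functional Abstraction step itself.

```python
from math import comb


def ora_p_value(population: int, annotated: int, sample: int, hits: int) -> float:
    """One-sided hypergeometric tail probability P(X >= hits).

    population: number of genes in the reference set (assumed background)
    annotated:  reference genes annotated to the GO term
    sample:     number of genes in the gene set under study
    hits:       genes in the gene set annotated to the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probabilities of observing `hits` or more annotated genes.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 20000 background genes, 150 annotated to the term,
    # a gene set of 400 genes, 12 of which carry the annotation.
    print(ora_p_value(20000, 150, 400, 12))
```

In practice such p-values are computed for every GO term and corrected for multiple testing; the terms that remain significant form the specific ontology from which the headlines are then selected.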