Clarke Erik L, Loguercio Salvatore, Good Benjamin M, Su Andrew I
The Scripps Research Institute, La Jolla, CA, USA.
J Biomed Semantics. 2013 Apr 15;4(Suppl 1):S4. doi: 10.1186/2041-1480-4-S1-S4.
The Gene Ontology and its associated annotations are critical tools for interpreting lists of genes. Here, we introduce a method for evaluating the Gene Ontology annotations and structure based on the impact they have on gene set enrichment analysis, along with an example implementation. This task-based approach yields quantitative assessments grounded in experimental data and anchored tightly to the primary use of the annotations.
Applied to specific areas of biological interest, our framework allowed us to trace the progress of annotation and ontology structure changes from 2004 to 2012. It also showed that the quality of the annotations and structure in the areas under test has been improving, as measured by their ability to recall underlying biological traits. Furthermore, we were able to distinguish the impact of changes to the annotation sets from the impact of changes to the ontology structure.
Our framework and implementation lay the groundwork for a powerful tool for evaluating the usefulness of the Gene Ontology. We demonstrate both the flexibility and the power of this approach in evaluating the current and past states of the Gene Ontology, as well as its applicability to developing new methods for creating gene annotations.
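To make the task-based idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that enrichment is scored with a one-sided hypergeometric test (a common choice, though the abstract does not name the statistic) and compares the same gene list against two versions of an annotation set. The function names, the term label `TERM_A`, and all gene identifiers are hypothetical toy values, not data from the study.

```python
from math import comb


def enrichment_p(universe_size, term_size, list_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    X is the number of term-annotated genes seen when list_size genes are
    drawn without replacement from universe_size genes, of which term_size
    carry the annotation.
    """
    upper = min(term_size, list_size)
    total = comb(universe_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(universe_size - term_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def term_p_value(annotations, term, gene_list, universe):
    """Enrichment p-value of one term for one gene list.

    Assumes gene_list is a subset of universe.
    """
    annotated = annotations.get(term, set()) & universe
    hits = annotated & gene_list
    return enrichment_p(len(universe), len(annotated), len(gene_list), len(hits))


# Hypothetical toy data: 20 genes and a 4-gene experimental list.
universe = {f"g{i}" for i in range(20)}
gene_list = {"g0", "g1", "g2", "g3"}

# Two hypothetical snapshots of the annotations for the same term.
old_annotations = {"TERM_A": {"g0", "g10"}}
new_annotations = {"TERM_A": {"g0", "g1", "g2", "g11"}}

p_old = term_p_value(old_annotations, "TERM_A", gene_list, universe)
p_new = term_p_value(new_annotations, "TERM_A", gene_list, universe)

print(f"old snapshot p = {p_old:.4f}")
print(f"new snapshot p = {p_new:.4f}")
```

In this toy case the later snapshot annotates more of the listed genes to the term, so the term is recovered with a smaller p-value. Tracking that kind of change across annotation versions, for gene lists whose underlying biology is known, is the sort of signal a task-based evaluation can use.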