

The Choice between MapMan and Gene Ontology for Automated Gene Function Prediction in Plant Science.

Authors

Klie Sebastian, Nikoloski Zoran

Affiliation

Genes and Small Molecules Group, Max-Planck Institute of Molecular Plant Physiology Potsdam-Golm, Germany.

Publication

Front Genet. 2012 Jun 28;3:115. doi: 10.3389/fgene.2012.00115. eCollection 2012.

Abstract

Since the introduction of the Gene Ontology (GO), the analysis of high-throughput data has become tightly coupled with the use of ontologies to establish associations between knowledge and data in an automated fashion. Ontologies provide a systematic description of knowledge through a controlled vocabulary of defined structure in which ontological concepts are connected by pre-defined relationships. In plant science, MapMan and GO offer two alternatives for ontology-driven analyses. Unlike GO, initially developed to characterize microbial systems, MapMan was specifically designed to cover plant-specific pathways and processes. While the dependencies between concepts in MapMan are modeled as a tree, in GO these are captured in a directed acyclic graph. Therefore, the difference between the ontologies may cause discrepancies in data reduction, visualization, and hypothesis generation. Here we provide the first systematic comparative analysis of GO and MapMan for the case of the model plant species Arabidopsis thaliana (Arabidopsis) with respect to their structural properties and differences in the distributions of information content. In addition, we investigate the effect of the two ontologies on the specificity and sensitivity of automated gene function prediction via the coupling of co-expression networks and the guilt-by-association principle. Automated gene function prediction is particularly needed for the model plant Arabidopsis, in which only half of the genes have been functionally annotated based on sequence similarity to known genes. The results highlight the need for structured representation of species-specific biological knowledge, and warrant caution in the design principles employed in future ontologies.
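As a rough illustration of the guilt-by-association principle mentioned in the abstract, the sketch below assigns a gene the most common annotation term among its co-expression neighbors. The gene identifiers, term labels, toy network, and the function name `predict_by_neighbors` are all hypothetical, and the simple majority vote is an assumption made for illustration; it is not the authors' prediction pipeline.

```python
from collections import Counter


def predict_by_neighbors(gene, network, annotations):
    """Return the most frequent annotation term among a gene's annotated
    co-expression neighbors, or None if no neighbor carries an annotation."""
    votes = Counter()
    for neighbor in network.get(gene, ()):
        for term in annotations.get(neighbor, ()):
            votes[term] += 1
    if not votes:
        return None
    # Highest vote count wins; ties are broken deterministically by term name.
    best_term, _ = max(votes.items(), key=lambda kv: (kv[1], kv[0]))
    return best_term


# Toy co-expression network (hypothetical): gene -> co-expressed neighbors.
network = {
    "geneX": ["geneA", "geneB", "geneC"],
    "geneY": [],
}

# Toy annotations (hypothetical): gene -> set of ontology terms.
annotations = {
    "geneA": {"photosynthesis"},
    "geneB": {"photosynthesis", "stress"},
    "geneC": {"stress.abiotic"},
}

prediction = predict_by_neighbors("geneX", network, annotations)
print(prediction)
```

In a tree-structured ontology such as MapMan each term has a single parent, whereas a term in the GO directed acyclic graph may have several, so the set of candidate terms entering such a vote can differ between the two ontologies.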


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f98/3384976/8408c545bd8e/fgene-03-00115-g001.jpg
