Fernández-Breis Jesualdo Tomás, Chiba Hirokazu, Legaz-García María Del Carmen, Uchiyama Ikuo
Departamento de Informática y Sistemas, Universidad de Murcia, IMIB-Arrixaca, Murcia, 30071, Spain.
National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, 444-8585, Aichi, Japan.
J Biomed Semantics. 2016 Jun 4;7(1):34. doi: 10.1186/s13326-016-0077-x.
Computational comparative analysis of multiple genomes provides valuable opportunities for biomedical research. In particular, orthology analysis can play a central role in comparative genomics: it guides the establishment of evolutionary relations among genes of different organisms and allows functional inference of gene products. However, current orthology databases vary widely, which calls for research into the shareability of content that is generated by different tools and stored in different structures. Exchanging this content with other research communities requires making its meaning explicit.
The need for a common ontology has led to the creation of the Orthology Ontology (ORTH), following best practices in ontology construction. Here, we describe our model and the major entities of the ontology, which is implemented in the Web Ontology Language (OWL), followed by an assessment of the quality of the ontology and the application of the ORTH to existing orthology datasets. By standardizing the representation of orthology databases, this shareable ontology makes it possible to develop Linked Orthology Datasets and a meta-predictor of orthology. The ORTH is freely available in OWL format to all users at http://purl.org/net/orth .
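As a rough illustration of what a Linked Orthology Dataset entry might look like, the sketch below serializes one orthologous group as N-Triples using only the Python standard library. Only the base URL http://purl.org/net/orth comes from the abstract; the `#` separator, the term names `OrthologsCluster` and `hasHomologousMember`, the `http://example.org/` data namespace, and the function `cluster_to_ntriples` are assumptions made for illustration and should be checked against the published OWL file before use.

```python
# Namespace built from the URL given in the abstract; the '#' separator and
# the term names used below are ASSUMPTIONS, not confirmed ORTH vocabulary.
ORTH = "http://purl.org/net/orth#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical base IRI for the dataset's own resources


def iri(value):
    """Wrap an IRI string in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def cluster_to_ntriples(cluster_id, gene_ids):
    """Return N-Triples lines describing one orthologous group.

    Identifiers are assumed to be plain alphanumeric strings; no IRI
    escaping is performed in this sketch.
    """
    cluster = iri(EX + "cluster/" + cluster_id)
    # Type the cluster with the (assumed) ORTH class for ortholog groups.
    lines = ["{} {} {} .".format(cluster, iri(RDF_TYPE), iri(ORTH + "OrthologsCluster"))]
    # Link each member gene to the cluster, in sorted order for stable output.
    for gene_id in sorted(gene_ids):
        gene = iri(EX + "gene/" + gene_id)
        lines.append("{} {} {} .".format(cluster, iri(ORTH + "hasHomologousMember"), gene))
    return lines


if __name__ == "__main__":
    for line in cluster_to_ntriples("c1", ["geneA", "geneB"]):
        print(line)
```

Because every dataset would point at the same ontology terms, triples produced from different orthology databases could in principle be merged and queried together, which is the kind of interoperability the abstract describes.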
The Orthology Ontology can serve as a framework for the semantic standardization of orthology content, and it will contribute to better exploitation of orthology resources in biomedical research. The results demonstrate the feasibility of developing shareable datasets using this ontology. Further applications will maximize the usefulness of this ontology.