Quality control for terms and definitions in ontologies and taxonomies.

Authors

Köhler Jacob, Munn Katherine, Rüegg Alexander, Skusa Andre, Smith Barry

Affiliation

Biomathematics and Bioinformatics Division, Rothamsted Research, Harpenden, UK.

Publication

BMC Bioinformatics. 2006 Apr 19;7:212. doi: 10.1186/1471-2105-7-212.

DOI:10.1186/1471-2105-7-212

PMID:16623942

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1482721/

Abstract

BACKGROUND

Ontologies and taxonomies are among the most important computational resources for molecular biology and bioinformatics. A series of recent papers has shown that the Gene Ontology (GO), the most prominent taxonomic resource in these fields, is marked by flaws of certain characteristic types, which flow from a failure to address basic ontological principles. As yet, no methods have been proposed which would allow ontology curators to pinpoint flawed terms or definitions in ontologies in a systematic way.

RESULTS

We present computational methods that automatically identify terms and definitions which are defined in a circular or unintelligible way. We further demonstrate the potential of these methods by applying them to isolate a subset of 6001 problematic GO terms. By automatically aligning GO with other ontologies and taxonomies we were able to propose alternative synonyms and definitions for some of these problematic terms. This allows us to demonstrate that these other resources do not contain definitions superior to those supplied by GO.
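As an illustration of what flagging a circular definition could look like, here is a minimal sketch. It assumes a definition is "circular" when it reuses all content words of the term it defines; the abstract does not state the authors' actual criteria, so the scoring rule, the stop-word list, the function names, and the sample entries are all hypothetical.

```python
import re

# Hypothetical stop-word list; a real check would need a fuller one.
STOP_WORDS = {"a", "an", "and", "by", "in", "is", "of", "or", "that", "the", "to", "which"}


def content_words(text):
    """Lowercase the text and return its set of non-stop-word tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOP_WORDS}


def circularity(term, definition):
    """Fraction of the term's content words that reappear in its definition."""
    term_words = content_words(term)
    if not term_words:
        return 0.0
    shared = term_words & content_words(definition)
    return len(shared) / len(term_words)


def flag_circular(entries, threshold=1.0):
    """Return the terms whose circularity score reaches the threshold."""
    return [term for term, definition in entries.items()
            if circularity(term, definition) >= threshold]


# Illustrative entries only; these are not quoted from GO.
entries = {
    "cell growth": "The growth of a cell.",
    "glycolysis": "The breakdown of glucose into pyruvate.",
}
flagged = flag_circular(entries)
```

With these sample entries only "cell growth" is flagged, because its definition merely rearranges the words of the term. Lowering `threshold` would also catch partially circular definitions, at the cost of more false positives for a curator to review.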

CONCLUSION

Our methods provide reliable indications of the quality of terms and definitions in ontologies and taxonomies. Further, they are well suited to assist ontology curators in drawing their attention to those terms that are ill-defined. We have further shown the limitations of ontology mapping and alignment in assisting ontology curators in rectifying problems, thus pointing to the need for manual curation.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d6b/1482721/876ac6fa8322/1471-2105-7-212-1.jpg
