Quality control for terms and definitions in ontologies and taxonomies.

Authors

Köhler Jacob, Munn Katherine, Rüegg Alexander, Skusa Andre, Smith Barry

Affiliation

Biomathematics and Bioinformatics Division, Rothamsted Research, Harpenden, UK.

Publication information

BMC Bioinformatics. 2006 Apr 19;7:212. doi: 10.1186/1471-2105-7-212.

Abstract

BACKGROUND

Ontologies and taxonomies are among the most important computational resources for molecular biology and bioinformatics. A series of recent papers has shown that the Gene Ontology (GO), the most prominent taxonomic resource in these fields, is marked by flaws of certain characteristic types, which flow from a failure to address basic ontological principles. As yet, no methods have been proposed which would allow ontology curators to pinpoint flawed terms or definitions in ontologies in a systematic way.

RESULTS

We present computational methods that automatically identify terms and definitions which are defined in a circular or unintelligible way. We further demonstrate the potential of these methods by applying them to isolate a subset of 6001 problematic GO terms. By automatically aligning GO with other ontologies and taxonomies we were able to propose alternative synonyms and definitions for some of these problematic terms. This allows us to demonstrate that these other resources do not contain definitions superior to those supplied by GO.
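To make the idea of a circular definition concrete, here is a minimal sketch that flags a term when all of its content words reappear in its own definition. It is an illustrative assumption rather than the authors' method, which the abstract does not describe. The stop-word list, the `circularity` score, the threshold, and the two sample entries are all hypothetical, and the sample definitions are invented, not quoted from GO.

```python
import re

# Hypothetical stop-word list; the abstract does not say how the authors filter words.
STOP_WORDS = {"a", "an", "the", "of", "to", "in", "and", "or", "by", "for", "with"}


def tokenize(text):
    """Lowercase the text and return its content words as a set."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}


def circularity(term, definition):
    """Fraction of the term's content words that reappear in its definition."""
    term_words = tokenize(term)
    if not term_words:
        return 0.0
    return len(term_words & tokenize(definition)) / len(term_words)


def flag_circular(entries, threshold=1.0):
    """Return the terms whose circularity score reaches the threshold."""
    return [t for t, d in entries.items() if circularity(t, d) >= threshold]


# Invented term/definition pairs, for illustration only.
entries = {
    "cell adhesion": "The adhesion of a cell to another cell or to a surface.",
    "apoptosis": "A form of programmed cell death.",
}

print(flag_circular(entries))
```

With these sample entries, only "cell adhesion" is flagged, because both of its words recur in the definition. A score like this can only direct a curator's attention to a candidate; whether the definition is actually uninformative still needs human judgment.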

CONCLUSION

Our methods provide reliable indications of the quality of terms and definitions in ontologies and taxonomies. Further, they are well suited to drawing ontology curators' attention to ill-defined terms. We have also shown the limitations of ontology mapping and alignment in helping curators rectify problems, which points to the need for manual curation.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d6b/1482721/876ac6fa8322/1471-2105-7-212-1.jpg
