Analysing Syntactic Regularities and Irregularities in SNOMED-CT.

Authors

Mikroyannidi Eleni, Stevens Robert, Iannone Luigi, Rector Alan

Affiliation

School of Computer Science, The University of Manchester, Oxford Road, Manchester, M13 9PL UK.

Publication information

J Biomed Semantics. 2012 Dec 17;3(1):8. doi: 10.1186/2041-1480-3-8.

DOI:10.1186/2041-1480-3-8

PMID:23244503

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3637289/

Abstract

MOTIVATION

In this paper we demonstrate the use of RIO, a framework for detecting syntactic regularities using cluster analysis of the entities in the signature of an ontology. Quality assurance in ontologies is vital for their use in real applications, but it is also a complex and difficult task. Methods and tools for it are especially important when the ontology lacks documentation and the user cannot consult the ontology developers to understand its construction. One aspect of quality assurance is checking how well an ontology complies with established 'coding standards': is the ontology regular in how descriptions of different types of entities are axiomatised? Are similar entities described in a similar way, and are there corner cases that are not covered by a pattern? Detection of regularities and irregularities in axiom patterns should provide ontology authors and quality inspectors with a level of abstraction at which compliance with coding standards can be checked automatically. However, there is a lack of such reverse ontology engineering methods and tools.

RESULTS

The RIO framework detects regularities in an OWL ontology, i.e. repetitive structures in the axioms of the ontology. We describe the use of standard machine learning approaches to cluster similar entities and generalise over their axioms to find regularities. This abstraction allows matches to, and deviations from, an ontology's patterns to be shown. We demonstrate its usage by inspecting three modules from SNOMED-CT, a large medical terminology, that cover "Present" and "Absent" findings, as well as "Chronic" and "Acute" findings. The modules contain 5,065, 20,688 and 19,812 asserted axioms. They are analysed in terms of the types and number of regularities and irregularities in their asserted axioms. The analysis showed that some modules of the terminology, which were expected to instantiate a pattern described in the SNOMED-CT technical guide, had a high number of deviations from that regularity. A subset of these deviations were categorised as "design defects" by checking them against past work on the quality assurance of SNOMED-CT. These were mainly incomplete descriptions. In the worst case, the expected patterns described in the technical guide were followed by only 5% of the axioms in the module.
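The idea of clustering entities and generalising over their axioms can be illustrated with a minimal sketch. This is not the RIO implementation: the entity names, the property name `hasClinicalCourse`, and the helper functions are all hypothetical, and the clustering step is reduced to grouping entities whose axioms use the same set of properties, a crude stand-in for the cluster analysis the abstract refers to.

```python
from collections import defaultdict

# Hypothetical toy module of "Chronic" findings: entity -> asserted
# (property, filler) pairs. Names are illustrative, not taken from SNOMED-CT.
MODULE = {
    "ChronicAsthma": {("subClassOf", "Asthma"), ("hasClinicalCourse", "Chronic")},
    "ChronicBronchitis": {("subClassOf", "Bronchitis"), ("hasClinicalCourse", "Chronic")},
    "ChronicSinusitis": {("subClassOf", "Sinusitis"), ("hasClinicalCourse", "Chronic")},
    "ChronicGastritis": {("subClassOf", "Gastritis")},  # incomplete description
}


def cluster_by_shape(module):
    """Group entities that use exactly the same set of properties."""
    clusters = defaultdict(list)
    for entity, axioms in module.items():
        shape = frozenset(prop for prop, _ in axioms)
        clusters[shape].append(entity)
    return {shape: sorted(members) for shape, members in clusters.items()}


def generalise(module, members):
    """Keep fillers shared by all members; replace varying fillers with '?x'."""
    fillers = defaultdict(set)
    for entity in members:
        for prop, filler in module[entity]:
            fillers[prop].add(filler)
    return {
        prop: next(iter(values)) if len(values) == 1 else "?x"
        for prop, values in fillers.items()
    }


def report(module):
    """Return the dominant generalised pattern, the deviating entities, and coverage."""
    clusters = cluster_by_shape(module)
    # Assumption: the largest cluster represents the regularity.
    regular_shape = max(clusters, key=lambda shape: len(clusters[shape]))
    pattern = generalise(module, clusters[regular_shape])
    deviations = sorted(
        entity
        for shape, members in clusters.items()
        if shape != regular_shape
        for entity in members
    )
    coverage = len(clusters[regular_shape]) / len(module)
    return pattern, deviations, coverage


pattern, deviations, coverage = report(MODULE)
print("pattern:", pattern)
print("deviations:", deviations)
print(f"coverage: {coverage:.0%}")
```

On this toy data the sketch generalises the varying parent class to a variable, keeps the shared `Chronic` filler, and flags `ChronicGastritis` as the only deviation, which mirrors the kind of incomplete description the abstract reports. Whether a flagged deviation is a real defect would still be a judgement for a domain expert.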

CONCLUSION

It is possible to automatically detect regularities and then inspect irregularities in an ontology. We argue that RIO is a tool for finding and reporting such matches and mismatches for evaluation by domain experts. We have demonstrated that standard clustering techniques from machine learning can offer a tool in the drive for quality assurance in ontologies.

AVAILABILITY

http://riotool.sourceforge.net/

CONTACT

eleni.mikroyannidi@manchester.ac.uk, robert.stevens@manchester.ac.uk
