Cui Licong
Department of EECS, Case Western Reserve University, Cleveland, OH; Division of Medical Informatics, Case Western Reserve University, Cleveland, OH.
AMIA Annu Symp Proc. 2015 Nov 5;2015:456-65. eCollection 2015.
Biomedical ontologies play a vital role in healthcare information management, data integration, and decision support. Ontology quality assurance (OQA) is an indispensable part of the ontology engineering cycle. Most existing OQA methods rely only on the knowledge provided within the targeted ontology. This paper proposes a novel cross-ontology analysis method, Cross-Ontology Hierarchical Relation Examination (COHeRE), to detect inconsistencies and possible errors in hierarchical relations across multiple ontologies. COHeRE leverages the Unified Medical Language System (UMLS) knowledge source and the MapReduce cloud computing technique for systematic, large-scale ontology quality assurance work. COHeRE takes the UMLS concepts and relations as input and consists of three main steps. First, the relations claimed in source vocabularies are filtered and aggregated for each pair of concepts. Second, a concept pair is flagged as inconsistent if it is related by different types of relations in different source vocabularies. Finally, the uncovered inconsistent relations are voted on according to their number of occurrences across different source vocabularies. The voting result, together with the inconsistent relations, serves as the output of COHeRE for possible ontological change. The relation with the highest vote provides an initial suggestion on how such an inconsistency might be fixed. In UMLS, 138,987 concept pairs were found to have inconsistent relationships across multiple source vocabularies. Forty inconsistent concept pairs involving hierarchical relationships were randomly selected and manually reviewed by a human expert. Of the inconsistent relations involved in these concept pairs, 95.8% indeed exist in their source vocabularies rather than having been introduced by mistake in the UMLS integration process. For the concept pairs with a suggested relationship, the human expert agreed with the suggestion in 73.7% of cases. The effectiveness of COHeRE indicates that UMLS provides a promising environment for enhancing the quality of biomedical ontologies through cross-ontology examination.
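The three steps described in the abstract (aggregate, detect, vote) can be illustrated with a minimal in-memory sketch. This is not the authors' MapReduce implementation. The row layout, the concept identifiers, the relation labels, the self-relation filter, and the rule of giving no suggestion on a tie are all assumptions made for illustration. Concept pairs are treated as ordered, and normalizing inverse relations is left out.

```python
from collections import Counter, defaultdict

# Hypothetical input rows: (concept_1, concept_2, relation_type, source_vocabulary).
# All identifiers and labels here are made up; they are not real UMLS data.
ROWS = [
    ("C001", "C002", "PAR", "SRC_A"),
    ("C001", "C002", "PAR", "SRC_B"),
    ("C001", "C002", "CHD", "SRC_C"),
    ("C003", "C004", "PAR", "SRC_A"),
    ("C003", "C004", "PAR", "SRC_B"),
    ("C005", "C005", "PAR", "SRC_A"),  # self-relation, dropped by the assumed filter
]


def aggregate(rows):
    """Step 1: filter rows, then collect per concept pair the sources claiming each relation."""
    by_pair = defaultdict(lambda: defaultdict(set))
    for c1, c2, rel, source in rows:
        if c1 == c2:  # assumed filter; the abstract does not specify the criteria
            continue
        by_pair[(c1, c2)][rel].add(source)
    return by_pair


def detect_inconsistent(by_pair):
    """Step 2: keep pairs that are related by more than one relation type."""
    return {pair: rels for pair, rels in by_pair.items() if len(rels) > 1}


def vote(rels):
    """Step 3: count distinct sources per relation; suggest the unique top relation."""
    counts = Counter({rel: len(sources) for rel, sources in rels.items()})
    ranked = counts.most_common()
    top_rel, top_votes = ranked[0]
    tied = len(ranked) > 1 and ranked[1][1] == top_votes
    return counts, (None if tied else top_rel)  # assumed: no suggestion on a tie


def cohere_sketch(rows):
    """Run the three steps and return votes plus a suggestion for each inconsistent pair."""
    result = {}
    for pair, rels in detect_inconsistent(aggregate(rows)).items():
        counts, suggestion = vote(rels)
        result[pair] = {"votes": dict(counts), "suggestion": suggestion}
    return result


if __name__ == "__main__":
    for pair, info in cohere_sketch(ROWS).items():
        print(pair, info)
```

In this toy input only the pair (C001, C002) is flagged, because two hypothetical sources claim one relation type and a third claims another. The majority relation becomes the suggestion. In a MapReduce setting the concept pair would naturally serve as the shuffle key, with the aggregation done in the reduce phase, though the abstract does not describe the job layout.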