Laboratory for Systems, Software and Semantics (LS3), Ryerson University, Ontario, Canada.
Faculty of Organizational Sciences (FOS), University of Belgrade, Belgrade, Serbia.
J Am Med Inform Assoc. 2018 Jul 1;25(7):819-826. doi: 10.1093/jamia/ocy021.
The goal of this work is to map Unified Medical Language System (UMLS) concepts to DBpedia resources using widely accepted ontology relations from the Simple Knowledge Organization System (skos:exactMatch, skos:closeMatch) and from the Resource Description Framework Schema (rdfs:seeAlso). The resulting complete mapping from UMLS (UMLS 2016AA) to DBpedia (DBpedia 2015-10) is publicly available and includes 221 690 skos:exactMatch, 26 276 skos:closeMatch, and 6 784 322 rdfs:seeAlso mappings.
We propose a method called circular resolution that utilizes a combination of semantic annotators to map UMLS concepts to DBpedia resources. A set of annotators annotate definitions of UMLS concepts returning DBpedia resources while another set performs annotation on DBpedia resource abstracts returning UMLS concepts. Our pipeline aligns these 2 sets of annotations to determine appropriate mappings from UMLS to DBpedia.
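The two-way agreement idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not give the actual alignment rule, so the function name `circular_resolution`, the input dictionaries, and the rule "keep a candidate only if the reverse annotation points back to the same concept" are hypothetical, and the identifiers in the usage note are toy values.

```python
def circular_resolution(forward, backward):
    """Keep only UMLS -> DBpedia candidates confirmed in both directions.

    forward:  dict mapping a UMLS concept ID to a ranked list of DBpedia
              resources found by annotating the concept's definition.
    backward: dict mapping a DBpedia resource to the set of UMLS concept
              IDs found by annotating the resource's abstract.
    Both inputs stand in for annotator output (an assumption, not the
    paper's data format).
    """
    mappings = {}
    for concept, resources in forward.items():
        # A candidate survives only if the reverse annotation of that
        # resource's abstract returns the same concept.
        confirmed = [r for r in resources if concept in backward.get(r, set())]
        if confirmed:
            mappings[concept] = confirmed
    return mappings
```

For example, with `forward = {"C1": ["dbr:Asthma", "dbr:Lung"]}` and `backward = {"dbr:Asthma": {"C1"}, "dbr:Lung": {"C2"}}`, only `dbr:Asthma` is retained for `C1`, because `dbr:Lung` does not point back to it.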
We evaluate our proposed method using structured data from the Wikidata knowledge base as the ground truth, which consists of 4899 already existing UMLS to DBpedia mappings. Our results show an 83% recall with 77% precision-at-one (P@1) in mapping UMLS concepts to DBpedia resources on this testing set.
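As a rough illustration of how the two reported figures could be computed, the sketch below scores ranked candidate lists against a ground-truth dictionary. The abstract does not spell out the exact definitions, so this assumes that recall counts ground-truth concepts whose correct resource appears anywhere in the candidate list, and that P@1 is taken over concepts for which at least one candidate was returned. The function name `recall_and_p_at_1` is hypothetical.

```python
def recall_and_p_at_1(predicted, gold):
    """Score ranked predictions against ground-truth mappings.

    predicted: dict mapping a concept ID to a ranked list of resources.
    gold:      dict mapping a concept ID to its single correct resource.
    Returns (recall, precision_at_1) under the assumptions stated above.
    """
    hits = 0       # correct resource appears anywhere in the ranking
    top1 = 0       # correct resource is ranked first
    answered = 0   # concepts with at least one candidate
    for concept, target in gold.items():
        ranked = predicted.get(concept, [])
        if ranked:
            answered += 1
            if ranked[0] == target:
                top1 += 1
        if target in ranked:
            hits += 1
    recall = hits / len(gold) if gold else 0.0
    p_at_1 = top1 / answered if answered else 0.0
    return recall, p_at_1
```

With four ground-truth concepts, of which two have the correct resource somewhere in their lists and one has it ranked first among three answered concepts, this returns a recall of 0.5 and a P@1 of one third.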
The proposed circular resolution method is a simple yet effective technique for linking UMLS concepts to DBpedia resources. Experiments using Wikidata-based ground truth reveal a high mapping accuracy. In addition to the complete UMLS mapping, which is downloadable in N-Triples format, we provide an online browser and a RESTful service to explore the mappings.
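To make the download format concrete, the sketch below writes a single mapping as one N-Triples line. The predicate URIs are the standard SKOS and RDFS ones; the UMLS subject URI (`http://example.org/umls/...`) is a placeholder assumption, since the abstract does not say which URI scheme the published file uses, and the helper name `to_ntriple` is hypothetical.

```python
# Standard predicate URIs for the three relation types named in the abstract.
PREDICATES = {
    "exactMatch": "http://www.w3.org/2004/02/skos/core#exactMatch",
    "closeMatch": "http://www.w3.org/2004/02/skos/core#closeMatch",
    "seeAlso": "http://www.w3.org/2000/01/rdf-schema#seeAlso",
}


def to_ntriple(subject_uri, relation, object_uri):
    """Serialize one mapping as an N-Triples statement: <s> <p> <o> ."""
    return f"<{subject_uri}> <{PREDICATES[relation]}> <{object_uri}> ."


# Placeholder subject URI (assumption); the object is a DBpedia resource URI.
line = to_ntriple(
    "http://example.org/umls/C0000000",
    "exactMatch",
    "http://dbpedia.org/resource/Asthma",
)
print(line)
```

Each mapping occupies exactly one line, so a file in this form can be filtered by relation type with ordinary line-oriented tools.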