An effective method of large scale ontology matching.

Author information

Diallo Gayo

Affiliation

University Bordeaux, ISPED, Centre INSERM U897, F-33000 Bordeaux, France.

Publication information

J Biomed Semantics. 2014 Oct 28;5(1):44. doi: 10.1186/2041-1480-5-44. eCollection 2014.

Abstract

BACKGROUND

We are currently facing a proliferation of heterogeneous biomedical data sources accessible through various knowledge-based applications. These data are annotated by increasingly extensive and widely disseminated knowledge organisation systems, ranging from simple terminologies and structured vocabularies to formal ontologies. To address the interoperability issue that arises from the heterogeneity of these ontologies, an alignment task is usually performed. However, while significant effort has been made to provide tools that automatically align small ontologies containing hundreds or thousands of entities, little attention has been paid to the matching of large ontologies in the life sciences domain.

RESULTS

We have designed and implemented ServOMap, an effective method for large scale ontology matching. It is a fast, efficient, high-precision system able to match input ontologies containing hundreds of thousands of entities. The system was included in the 2012 and 2013 editions of the Ontology Alignment Evaluation Initiative campaign, where it performed very well and ranked among the top systems for large ontology matching.

CONCLUSIONS

We proposed an approach for large scale ontology matching that relies on Information Retrieval (IR) techniques and on a combination of lexical and machine learning contextual similarity computation to generate candidate mappings. It is particularly well suited to the life sciences domain, as many ontologies in this domain include synonym terms taken from the Unified Medical Language System, which our IR strategy can exploit. The ServOMap system we implemented can handle hundreds of thousands of entities with efficient computation time.
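
The abstract gives no implementation details, so the following is only a minimal sketch of how an IR-style lexical step for candidate mapping generation might look. It is not ServOMap's code. The function names, the TF-IDF cosine scoring, the threshold value and the toy data are all assumptions made for illustration, and the machine learning contextual similarity step is not shown. The idea is to index the labels and synonyms of one ontology, then use the labels and synonyms of each entity in the other ontology as a query.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def bag_of_tokens(labels):
    """Merge all labels and synonyms of one entity into a token count."""
    return Counter(t for label in labels for t in tokenize(label))


def build_index(entities):
    """Build an inverted index (token -> entity ids) over an ontology.

    `entities` is a hypothetical structure: dict of id -> list of labels.
    """
    index = defaultdict(set)
    docs = {}
    for eid, labels in entities.items():
        docs[eid] = bag_of_tokens(labels)
        for token in docs[eid]:
            index[token].add(eid)
    return index, docs


def cosine(a, b):
    """Cosine similarity between two sparse weight vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def candidate_mappings(source, target, threshold=0.5):
    """Return (source_id, target_id, score) triples above the threshold."""
    index, docs = build_index(target)
    n = len(docs)
    idf = {t: math.log(1 + n / len(ids)) for t, ids in index.items()}
    unseen_idf = math.log(1 + n)  # assumed weight for tokens absent from the index

    def weights(tokens):
        return {t: c * idf.get(t, unseen_idf) for t, c in tokens.items()}

    target_weights = {eid: weights(tokens) for eid, tokens in docs.items()}
    results = []
    for sid, labels in source.items():
        query = bag_of_tokens(labels)
        query_weights = weights(query)
        # Only score target entities that share at least one token with the query.
        hits = set().union(*(index.get(t, set()) for t in query))
        for tid in hits:
            score = cosine(query_weights, target_weights[tid])
            if score >= threshold:
                results.append((sid, tid, score))
    return sorted(results, key=lambda r: r[2], reverse=True)


# Toy data, invented for illustration only.
source_ontology = {"s1": ["Myocardial infarction", "Heart attack"]}
target_ontology = {"t1": ["Heart attack"], "t2": ["Kidney"]}
mappings = candidate_mappings(source_ontology, target_ontology)
```

In this toy example the synonym "Heart attack" is what lets `s1` retrieve `t1`, which illustrates why synonym-rich ontologies suit a retrieval-based strategy. The inverted index also means each query is scored only against entities that share a token with it, rather than against every entity in the target ontology.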


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b45a/4236493/5eeafe35b617/13326_2013_189_Fig1_HTML.jpg
