Concept embedding to measure semantic relatedness for biomedical information ontologies.

Affiliations

Department of Bio and Brain Engineering, KAIST, Daejeon, Republic of Korea.

Milner Therapeutics Institute University of Cambridge, Cambridge CB2 1TN, UK.

Publication information

J Biomed Inform. 2019 Jun;94:103182. doi: 10.1016/j.jbi.2019.103182. Epub 2019 Apr 19.

Abstract

There have been many attempts to identify relationships among concepts corresponding to terms from biomedical information ontologies such as the Unified Medical Language System (UMLS). In particular, vector representation of such concepts using information from UMLS definition texts is widely used to measure the relatedness between two biological concepts. However, conventional relatedness measures have a limited range of applicable word coverage, which limits the performance of these models. In this paper, we propose a concept-embedding model of a UMLS semantic relatedness measure to overcome the limitations of earlier models. We obtained context texts of biological concepts that are not defined in UMLS by utilizing Wikipedia as an external knowledgebase. Concept vector representations were then derived from the context texts of the biological concepts. The degree of relatedness between two concepts was defined as the cosine similarity between corresponding concept vectors. As a result, we validated that our method provides higher coverage and better performance than the conventional method.
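
The relatedness score described in the abstract, the cosine similarity between two concept vectors, can be illustrated with a minimal sketch. This shows only the scoring step; it does not reproduce how the authors derive embeddings from UMLS definitions or Wikipedia context texts. The function name `cosine_similarity`, the concept labels, and the toy three-dimensional vectors are hypothetical and chosen purely for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings (assumption: real concept vectors would be
# learned from context texts and have far more dimensions).
concept_vectors = {
    "concept_a": [1.0, 2.0, 0.0],
    "concept_b": [2.0, 4.0, 0.0],  # same direction as concept_a
    "concept_c": [0.0, 0.0, 3.0],  # orthogonal to concept_a
}

related = cosine_similarity(concept_vectors["concept_a"], concept_vectors["concept_b"])
unrelated = cosine_similarity(concept_vectors["concept_a"], concept_vectors["concept_c"])
print(f"a vs b: {related:.3f}")
print(f"a vs c: {unrelated:.3f}")
```

Because the score depends only on the angle between vectors, two concepts whose vectors point in the same direction score close to 1 regardless of vector length, while orthogonal vectors score 0.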
