Supporting SNOMED CT postcoordination with knowledge graph embeddings.

Authors

Castell-Díaz Javier, Miñarro-Giménez Jose Antonio, Martínez-Costa Catalina

Affiliations

Dept. Informatica y Sistemas, Universidad de Murcia, IMIB-Arrixaca, Murcia, Spain.

Publication information

J Biomed Inform. 2023 Mar;139:104297. doi: 10.1016/j.jbi.2023.104297. Epub 2023 Feb 1.

Abstract

SNOMED CT postcoordination is an underused mechanism that can help implement advanced systems for automatically extracting and encoding clinical information from text. It allows concepts that do not exist in SNOMED CT to be defined through their relationships with existing ones. Building postcoordinated expressions manually is difficult: it requires deep knowledge of the terminology and the support of specialized tools, which barely exist. To support the building of postcoordinated expressions, we have implemented KGE4SCT, a method that suggests the corresponding SNOMED CT postcoordinated expression for a given clinical term. We leverage the SNOMED CT ontology and its graph-like structure and use knowledge graph embeddings (KGEs). The objective of such embeddings is to represent knowledge graph components (e.g., entities and relations) in a vector space in a way that captures the structure of the graph. We then use vector similarity and analogies to obtain the postcoordinated expression of a given clinical term. For the Spanish SNOMED CT version, we obtained a semantic type accuracy of 98%, a relationship accuracy of 90%, and an analogy accuracy of 60%, with an overall completeness of postcoordination of 52%. We also applied the method to the English SNOMED CT version and outperformed state-of-the-art methods both in corpus generation for language model training for this task (a 6% improvement in analogy accuracy) and in automatic postcoordination of SNOMED CT expressions (a 17% increase in partial conversion rate).
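
The idea of using vector similarity and analogies over embeddings can be illustrated with a minimal Python sketch. It is not the KGE4SCT implementation: the three-dimensional vectors, the concept labels, and the function names `cosine` and `solve_analogy` are all hypothetical and chosen only for illustration. The sketch assumes a simple offset-style analogy ("a is to b as c is to ?"), answered by finding the nearest neighbour of b - a + c under cosine similarity; the abstract does not specify which analogy formulation the authors use.

```python
import math

# Toy 3-dimensional embeddings. The values and labels are invented for
# illustration; they are not real SNOMED CT embeddings or identifiers.
EMBEDDINGS = {
    "fracture":          [1.0, 0.0, 0.0],
    "femur":             [0.0, 1.0, 0.0],
    "tibia":             [0.0, 0.0, 1.0],
    "fracture_of_femur": [1.0, 1.0, 0.0],
    "fracture_of_tibia": [1.0, 0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def solve_analogy(a, b, c, embeddings=EMBEDDINGS):
    """Answer "a is to b as c is to ?" via the nearest neighbour of b - a + c."""
    target = [
        vb - va + vc
        for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    # Exclude the three input concepts so they cannot be returned trivially.
    candidates = [name for name in embeddings if name not in {a, b, c}]
    return max(candidates, key=lambda name: cosine(embeddings[name], target))


if __name__ == "__main__":
    # "fracture_of_femur" relates to "femur" as "fracture_of_tibia" relates to ?
    print(solve_analogy("fracture_of_femur", "femur", "fracture_of_tibia"))
```

With these toy vectors, the known pair ("fracture_of_femur", "femur") acts as a template for a site-like relationship, and the query term "fracture_of_tibia" resolves to "tibia". In a real setting the embeddings would be learned from the SNOMED CT graph rather than written by hand, and the retrieved concept would be one candidate value for an attribute in the postcoordinated expression.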
