The impact of learning Unified Medical Language System knowledge embeddings in relation extraction from biomedical texts.

Author information

Human Language Technology Research Institute, Department of Computer Science, Erik Jonsson School of Engineering & Computer Science, University of Texas at Dallas, Richardson, Texas, USA.

Publication information

J Am Med Inform Assoc. 2020 Oct 1;27(10):1556-1567. doi: 10.1093/jamia/ocaa205.

Abstract

OBJECTIVE

We explored how knowledge embeddings (KEs) learned from the Unified Medical Language System (UMLS) Metathesaurus impact the quality of relation extraction on 2 diverse sets of biomedical texts.

MATERIALS AND METHODS

Two forms of KEs were learned for concepts and relation types from the UMLS Metathesaurus, namely lexicalized knowledge embeddings (LKEs) and unlexicalized KEs. A knowledge embedding encoder (KEE) enabled learning either LKEs or unlexicalized KEs as well as neural models capable of producing LKEs for mentions of biomedical concepts in texts and relation types that are not encoded in the UMLS Metathesaurus. This allowed us to design the relation extraction with knowledge embeddings (REKE) system, which incorporates either LKEs or unlexicalized KEs produced for relation types of interest and their arguments.

RESULTS

The incorporation of either LKEs or unlexicalized KEs in REKE advances the state of the art in relation extraction on 2 relation extraction datasets: the 2010 i2b2/VA dataset and the 2013 Drug-Drug Interaction Extraction Challenge corpus. Moreover, the impact of LKEs is superior, achieving F1 scores of 78.2 and 82.0, respectively.
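
For readers unfamiliar with the metric, the F1 scores above are the harmonic mean of precision and recall, reported here on a 0-100 scale. The following is a minimal sketch of that computation, not the authors' evaluation code. The function name `f1_score` and the counts in the usage line are hypothetical, and the abstract does not state how scores were averaged across relation types.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return F1 (harmonic mean of precision and recall) on a 0-1 scale."""
    predicted = true_positives + false_positives  # relations the system extracted
    actual = true_positives + false_negatives     # relations in the gold annotations
    # With no correct extractions, precision or recall is zero or undefined,
    # so F1 is reported as 0.0.
    if predicted == 0 or actual == 0 or true_positives == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not taken from the paper).
print(round(100 * f1_score(8, 2, 2), 1))  # prints 80.0
```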

DISCUSSION

REKE not only highlights the importance of incorporating knowledge encoded in the UMLS Metathesaurus in a novel way, through 2 possible forms of KEs, but it also showcases the subtleties of incorporating KEs in relation extraction systems.

CONCLUSIONS

Incorporating LKEs informed by the UMLS Metathesaurus in a relation extraction system operating on biomedical texts shows significant promise. We present the REKE system, which establishes new state-of-the-art results for relation extraction on 2 datasets when using LKEs.

References cited in this article

2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text.
J Am Med Inform Assoc. 2011 Sep-Oct;18(5):552-6. doi: 10.1136/amiajnl-2011-000203. Epub 2011 Jun 16.
