Center for Statistical Science, Tsinghua University, Beijing, China; Department of Industrial Engineering, Tsinghua University, Beijing, China.
Institute of Medical Information, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing, China.
J Biomed Inform. 2022 Feb;126:103983. doi: 10.1016/j.jbi.2021.103983. Epub 2022 Jan 4.
This paper proposes knowledge-aware embeddings, a critical tool for medical term normalization.
We develop CODER (Cross-lingual knowledge-infused medical term embedding) via contrastive learning over the Unified Medical Language System (UMLS), a medical knowledge graph (KG); similarities are computed from both terms and relation triplets in the KG. Training with relations injects medical knowledge into the embeddings and can improve their performance as machine learning features.
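The contrastive idea described above can be sketched with a toy example. This is a hypothetical illustration, not the authors' implementation: synonymous terms for the same UMLS concept should receive high embedding similarity, and an InfoNCE-style loss pulls a positive (synonym) toward the anchor while pushing unrelated terms away. Toy NumPy vectors stand in for encoder outputs.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE loss: pull the positive toward the anchor, push negatives away."""
    sims = np.array([cosine(anchor, positive)] +
                    [cosine(anchor, n) for n in negatives]) / tau
    sims -= sims.max()                        # numerical stability
    probs = np.exp(sims) / np.exp(sims).sum()
    return -np.log(probs[0])                  # the positive sits at index 0

# Toy embeddings: two synonyms of one concept and an unrelated term.
heart_attack = np.array([0.9, 0.1, 0.0])
mi           = np.array([0.85, 0.15, 0.05])   # "myocardial infarction"
fracture     = np.array([0.0, 0.2, 0.9])

loss = info_nce(heart_attack, mi, [fracture])  # small loss: synonyms already close
```

The same scoring extends to relation triplets by treating the tail concept of a true (head, relation, tail) triplet as the positive and corrupted tails as negatives.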
We evaluate CODER based on zero-shot term normalization, semantic similarity, and relation classification benchmarks, and the results show that CODER outperforms various state-of-the-art biomedical word embeddings, concept embeddings, and contextual embeddings.
CODER embeddings accurately capture the semantic similarity and relatedness of medical concepts. CODER can be used for embedding-based medical term normalization or to provide features for machine learning. Like other pretrained language models, CODER can also be fine-tuned for specific tasks. Code and models are available at https://github.com/GanjinZero/CODER.
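Embedding-based term normalization, as mentioned above, reduces to a nearest-neighbor lookup. The sketch below is a minimal, hypothetical illustration: every dictionary term is embedded once, and a free-text mention is mapped to the concept whose term embedding is nearest by cosine similarity. Toy vectors and the concept ID list stand in for real CODER embeddings and a real UMLS dictionary (in practice one would encode terms with the released model).

```python
import numpy as np

def normalize_rows(m):
    # L2-normalize each row so that dot products equal cosine similarities.
    return m / np.linalg.norm(m, axis=1, keepdims=True)

# Hypothetical dictionary: concept IDs and a toy embedding per term.
concepts = ["C0027051",  # myocardial infarction
            "C0016658"]  # fracture
dictionary = normalize_rows(np.array([[0.9, 0.1, 0.0],
                                      [0.0, 0.2, 0.9]]))

def normalize_term(query_vec):
    """Return the concept ID of the dictionary term nearest to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    return concepts[int(np.argmax(dictionary @ q))]

mention = np.array([0.85, 0.15, 0.05])  # toy embedding of "heart attack"
result = normalize_term(mention)        # maps to the myocardial infarction concept
```

Because lookup is a single matrix-vector product over normalized embeddings, it scales to large dictionaries with standard approximate nearest-neighbor indexes.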