

NeighBERT: Medical Entity Linking Using Relation-Induced Dense Retrieval.

Author Information

Singh Ayush, Krishnamoorthy Saranya, Ortega John E

Affiliation

inQbator AI, Evernorth Health Services, Saint Louis, MO, USA.

Publication Information

J Healthc Inform Res. 2024 Jan 18;8(2):353-369. doi: 10.1007/s41666-023-00136-3. eCollection 2024 Jun.

Abstract


One of the common tasks in clinical natural language processing is medical entity linking (MEL), which involves mention detection followed by linking each mention to an entity in a knowledge base. One reason MEL remains unsolved is linguistic ambiguity: the same text span can resolve to several different named entities. This problem is exacerbated in the text found in electronic health records. Recent work has shown that transformer-based deep learning models outperform previous linking methods. We introduce NeighBERT, a custom pre-training technique that extends BERT (Devlin et al [1]) by encoding how entities are related within a knowledge graph. This technique adds relational context that is missing from the original BERT, helping resolve the ambiguity found in clinical text. In our experiments, NeighBERT improves the precision, recall, and F1-score of the state of the art by 1-3 points for named entity recognition and 10-15 points for MEL on two widely known clinical datasets.
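The linking step the abstract describes can be illustrated with a minimal dense-retrieval sketch: encode the detected mention and every knowledge-base entity as vectors, then link the mention to its nearest entity by cosine similarity. This is a toy illustration, not NeighBERT's actual implementation; the 4-dimensional vectors stand in for real encoder output, and the UMLS-style concept IDs are placeholders chosen for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def link_mention(mention_vec, entity_vecs):
    """Link a mention embedding to the most similar knowledge-base entity."""
    return max(entity_vecs, key=lambda eid: cosine(mention_vec, entity_vecs[eid]))

# Toy embeddings standing in for encoder output; placeholder concept IDs.
entity_vecs = {
    "C0020538": [0.9, 0.1, 0.0, 0.1],
    "C0011849": [0.1, 0.8, 0.2, 0.0],
    "C0027051": [0.0, 0.2, 0.9, 0.1],
}
mention_vec = [0.85, 0.15, 0.05, 0.1]  # embedding of a detected mention

best = link_mention(mention_vec, entity_vecs)  # → "C0020538"
```

NeighBERT's contribution is upstream of this step: its pre-training injects knowledge-graph relations into the encoder, so that the embeddings being compared already reflect how entities relate to their neighbors, which is what disambiguates mentions that plain BERT embeddings would conflate.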

SUPPLEMENTARY INFORMATION

The online version contains supplementary material available at 10.1007/s41666-023-00136-3.


Similar Articles

1
NeighBERT: Medical Entity Linking Using Relation-Induced Dense Retrieval.
J Healthc Inform Res. 2024 Jan 18;8(2):353-369. doi: 10.1007/s41666-023-00136-3. eCollection 2024 Jun.
2
Extracting clinical named entity for pituitary adenomas from Chinese electronic medical records.
BMC Med Inform Decis Mak. 2022 Mar 23;22(1):72. doi: 10.1186/s12911-022-01810-z.
3
Korean clinical entity recognition from diagnosis text using BERT.
BMC Med Inform Decis Mak. 2020 Sep 30;20(Suppl 7):242. doi: 10.1186/s12911-020-01241-8.
4
Evaluating Medical Entity Recognition in Health Care: Entity Model Quantitative Study.
JMIR Med Inform. 2024 Oct 17;12:e59782. doi: 10.2196/59782.
5
Incorporating entity-level knowledge in pretrained language model for biomedical dense retrieval.
Comput Biol Med. 2023 Nov;166:107535. doi: 10.1016/j.compbiomed.2023.107535. Epub 2023 Sep 28.
6
Deep learning with language models improves named entity recognition for PharmaCoNER.
BMC Bioinformatics. 2021 Dec 17;22(Suppl 1):602. doi: 10.1186/s12859-021-04260-y.
8
Analyzing transfer learning impact in biomedical cross-lingual named entity recognition and normalization.
BMC Bioinformatics. 2021 Dec 17;22(Suppl 1):601. doi: 10.1186/s12859-021-04247-9.
9
Automatic knowledge extraction from Chinese electronic medical records and rheumatoid arthritis knowledge graph construction.
Quant Imaging Med Surg. 2023 Jun 1;13(6):3873-3890. doi: 10.21037/qims-22-1158. Epub 2023 May 8.
10
Language model based on deep learning network for biomedical named entity recognition.
Methods. 2024 Jun;226:71-77. doi: 10.1016/j.ymeth.2024.04.013. Epub 2024 Apr 17.

References Cited in This Article

1
Improving broad-coverage medical entity linking with semantic type prediction and large-scale datasets.
J Biomed Inform. 2021 Sep;121:103880. doi: 10.1016/j.jbi.2021.103880. Epub 2021 Aug 12.
2
Multi-domain clinical natural language processing with MedCAT: The Medical Concept Annotation Toolkit.
Artif Intell Med. 2021 Jul;117:102083. doi: 10.1016/j.artmed.2021.102083. Epub 2021 May 1.
3
4
Clinical concept extraction using transformers.
J Am Med Inform Assoc. 2020 Dec 9;27(12):1935-1942. doi: 10.1093/jamia/ocaa189.
5
Use of word and graph embedding to measure semantic relatedness between Unified Medical Language System concepts.
J Am Med Inform Assoc. 2020 Oct 1;27(10):1538-1546. doi: 10.1093/jamia/ocaa136.
6
Clinical concept extraction: A methodology review.
J Biomed Inform. 2020 Sep;109:103526. doi: 10.1016/j.jbi.2020.103526. Epub 2020 Aug 6.
8
BERT-based Ranking for Biomedical Entity Normalization.
AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:269-277. eCollection 2020.
10
BioBERT: a pre-trained biomedical language representation model for biomedical text mining.
Bioinformatics. 2020 Feb 15;36(4):1234-1240. doi: 10.1093/bioinformatics/btz682.
