Text Analytics and NLP Lab, Department of Computer Applications, NIT Trichy, India.
Artif Intell Med. 2021 Feb;112:102008. doi: 10.1016/j.artmed.2021.102008. Epub 2021 Jan 7.
In the last few years, people have begun to share a great deal of health-related information in the form of tweets, reviews, and blog posts. These user-generated clinical texts can be mined for useful insights. However, automatic analysis of clinical text requires the identification of standard medical concepts. Most existing deep learning based medical concept normalization systems are built on CNNs or RNNs, and their performance is limited because they must be trained from scratch (except for the embeddings). In this work, we propose a medical concept normalization system based on BERT and a highway layer. BERT, a pre-trained context-sensitive deep language representation model, has advanced the state of the art in many NLP tasks, and the gating mechanism in the highway layer helps the model select only the important information. Experimental results show that our model outperforms all existing methods on two standard datasets. Further, we conduct a series of experiments to study the impact on our model of different learning rates and batch sizes, noise, and freezing of encoder layers.
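The abstract does not specify the exact architecture, but the gating mechanism it mentions is the standard highway layer (Srivastava et al., 2015), which mixes a transformed representation with the unchanged input via a learned sigmoid gate. A minimal numpy sketch of one such layer, applied here to an illustrative BERT-base-sized vector (the 768-dimensional size and random weights are assumptions, not the paper's trained parameters):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def highway_layer(x, W_h, b_h, W_t, b_t):
    """One highway layer: output = g * H(x) + (1 - g) * x.

    The transform gate g = sigmoid(x @ W_t + b_t) decides, per dimension,
    how much of the transformed representation H(x) = relu(x @ W_h + b_h)
    to pass through versus carrying the input x unchanged.
    """
    h = np.maximum(0.0, x @ W_h + b_h)   # transformed representation H(x)
    g = sigmoid(x @ W_t + b_t)           # transform gate, values in (0, 1)
    return g * h + (1.0 - g) * x         # gated mix of transform and carry

# Toy usage on a vector the size of a BERT-base hidden state (illustrative).
rng = np.random.default_rng(0)
d = 768
x = rng.standard_normal(d)
W_h, b_h = 0.01 * rng.standard_normal((d, d)), np.zeros(d)
W_t, b_t = 0.01 * rng.standard_normal((d, d)), np.full(d, -1.0)  # bias gate toward carry
y = highway_layer(x, W_h, b_h, W_t, b_t)
print(y.shape)  # (768,)
```

Biasing `b_t` negative, as in the original highway-network recipe, starts the layer close to an identity mapping, which keeps training stable when the layer sits on top of a pre-trained encoder.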