Chinese clinical named entity recognition with variant neural structures based on BERT methods.

Affiliations

School of Mathematical Sciences, Peking University, Beijing 100871, China; Center for Statistical Sciences, Peking University, Beijing 100871, China.

Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China.

Publication information

J Biomed Inform. 2020 Jul;107:103422. doi: 10.1016/j.jbi.2020.103422. Epub 2020 Apr 28.

DOI:10.1016/j.jbi.2020.103422

PMID:32353595

Abstract

Clinical Named Entity Recognition (CNER) is a critical task that aims to identify and classify clinical terms in electronic medical records. In recent years, deep neural networks have achieved significant success in CNER. However, these methods require large-scale, high-quality labeled clinical data, which is difficult and expensive to obtain, especially for Chinese clinical records. To tackle the Chinese CNER task, we pre-train a BERT model on unlabeled Chinese clinical records, which lets the model exploit domain-specific knowledge from unlabeled text. Layers such as Long Short-Term Memory (LSTM) and Conditional Random Field (CRF) are used to extract text features and decode the predicted tags, respectively. In addition, we propose a new strategy for incorporating dictionary features into the model. Radical features of Chinese characters are also used to improve model performance. To the best of our knowledge, our ensemble model outperforms state-of-the-art models, achieving an 89.56% strict F1 score on the CCKS-2018 dataset and a 91.60% F1 score on the CCKS-2017 dataset.
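The abstract says a CRF layer decodes the predicted tags from features produced by the LSTM. As a rough illustration of that decoding step only, the sketch below runs standard Viterbi decoding for a linear-chain CRF in plain Python. It is not the authors' implementation: the function name `viterbi_decode`, the tag names, and all scores are hypothetical, and start/end transition constraints are left out for brevity.

```python
from typing import Dict, List, Tuple

NEG_INF = float("-inf")


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[i][t]     : score of tag t at position i (e.g. from a BiLSTM layer)
    transitions[(a, b)] : score of moving from tag a to tag b; missing pairs score 0
    """
    if not emissions:
        return []
    # best[t] = best score of any path that ends in tag t at the current position
    best = {t: emissions[0][t] for t in tags}
    backpointers: List[Dict[str, str]] = []
    for scores in emissions[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Pick the previous tag that maximizes path score + transition score.
            prev = max(tags, key=lambda p: best[p] + transitions.get((p, cur), 0.0))
            new_best[cur] = best[prev] + transitions.get((prev, cur), 0.0) + scores[cur]
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final tag to recover the full path.
    path = [max(tags, key=lambda t: best[t])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical three-character example with made-up scores.
tags = ["O", "B-DRUG", "I-DRUG"]
emissions = [
    {"O": 1.0, "B-DRUG": 0.8, "I-DRUG": 0.1},
    {"O": 0.2, "B-DRUG": 0.1, "I-DRUG": 2.0},
    {"O": 1.5, "B-DRUG": 0.1, "I-DRUG": 0.3},
]
# Forbid an I- tag directly after O, as the BIO scheme requires.
transitions = {("O", "I-DRUG"): NEG_INF}

decoded = viterbi_decode(emissions, transitions, tags)
print(decoded)  # ['B-DRUG', 'I-DRUG', 'O']
```

Taking the per-position argmax of these emissions would give `O, I-DRUG, O`, which is not a valid BIO sequence. The transition scores are what let the decoder prefer `B-DRUG, I-DRUG, O` instead, which is the usual motivation for placing a CRF on top of per-token scores.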

Similar articles

Chinese clinical named entity recognition with radical-level feature and self-attention mechanism.

J Biomed Inform. 2019 Oct;98:103289. doi: 10.1016/j.jbi.2019.103289. Epub 2019 Sep 18.

Chinese Clinical Named Entity Recognition in Electronic Medical Records: Development of a Lattice Long Short-Term Memory Model With Contextualized Character Representations.

JMIR Med Inform. 2020 Sep 4;8(9):e19848. doi: 10.2196/19848.

Extracting clinical named entity for pituitary adenomas from Chinese electronic medical records.

BMC Med Inform Decis Mak. 2022 Mar 23;22(1):72. doi: 10.1186/s12911-022-01810-z.

An attention-based deep learning model for clinical named entity recognition of Chinese electronic medical records.

BMC Med Inform Decis Mak. 2019 Dec 5;19(Suppl 5):235. doi: 10.1186/s12911-019-0933-6.

Named entity recognition of Chinese electronic medical records based on a hybrid neural network and medical MC-BERT.

BMC Med Inform Decis Mak. 2022 Dec 1;22(1):315. doi: 10.1186/s12911-022-02059-2.

Adversarial training based lattice LSTM for Chinese clinical named entity recognition.

J Biomed Inform. 2019 Nov;99:103290. doi: 10.1016/j.jbi.2019.103290. Epub 2019 Sep 23.

Chinese Clinical Named Entity Recognition Using Residual Dilated Convolutional Neural Network With Conditional Random Field.

IEEE Trans Nanobioscience. 2019 Jul;18(3):306-315. doi: 10.1109/TNB.2019.2908678. Epub 2019 Apr 1.

A deep learning model incorporating part of speech and self-matching attention for named entity recognition of Chinese electronic medical records.

BMC Med Inform Decis Mak. 2019 Apr 9;19(Suppl 2):65. doi: 10.1186/s12911-019-0762-7.

Clinical Named Entity Recognition From Chinese Electronic Health Records via Machine Learning Methods.

JMIR Med Inform. 2018 Dec 17;6(4):e50. doi: 10.2196/medinform.9965.

Cited by

Comparative Analysis of Large Language Models in Chinese Medical Named Entity Recognition.

Bioengineering (Basel). 2024 Sep 29;11(10):982. doi: 10.3390/bioengineering11100982.

Construction, evaluation, and application of an electronic medical record corpus for cerebral palsy rehabilitation.

Digit Health. 2024 Sep 27;10:20552076241286260. doi: 10.1177/20552076241286260. eCollection 2024 Jan-Dec.

Transformers and large language models in healthcare: A review.

Artif Intell Med. 2024 Aug;154:102900. doi: 10.1016/j.artmed.2024.102900. Epub 2024 Jun 5.

Evolution and emerging trends of named entity recognition: Bibliometric analysis from 2000 to 2023.

Heliyon. 2024 Apr 22;10(9):e30053. doi: 10.1016/j.heliyon.2024.e30053. eCollection 2024 May 15.

Exploring the Latest Highlights in Medical Natural Language Processing across Multiple Languages: A Survey.

Yearb Med Inform. 2023 Aug;32(1):230-243. doi: 10.1055/s-0043-1768726. Epub 2023 Dec 26.

CLART: A cascaded lattice-and-radical transformer network for Chinese medical named entity recognition.

Heliyon. 2023 Oct 10;9(10):e20692. doi: 10.1016/j.heliyon.2023.e20692. eCollection 2023 Oct.

Application of Entity-BERT model based on neuroscience and brain-like cognition in electronic medical record entity recognition.

Front Neurosci. 2023 Sep 20;17:1259652. doi: 10.3389/fnins.2023.1259652. eCollection 2023.

A BERT-Span model for Chinese named entity recognition in rehabilitation medicine.

PeerJ Comput Sci. 2023 Aug 21;9:e1535. doi: 10.7717/peerj-cs.1535. eCollection 2023.

Chat agents respond more empathetically by using hearsay experience.

Front Robot AI. 2023 Jul 25;10:960087. doi: 10.3389/frobt.2023.960087. eCollection 2023.

A Joint Extraction System Based on Conditional Layer Normalization for Health Monitoring.

Sensors (Basel). 2023 May 16;23(10):4812. doi: 10.3390/s23104812.
