School of Data and Computer Science, Guangdong Province Key Lab of Computational Science, Sun Yat-Sen University, Guangzhou, Guangdong 510006, PR China.
J Biomed Inform. 2019 Oct;98:103289. doi: 10.1016/j.jbi.2019.103289. Epub 2019 Sep 18.
Named entity recognition is a fundamental and crucial task in medical natural language processing. In the medical domain, Chinese clinical named entity recognition identifies the boundaries and types of medical entities in unstructured text such as electronic medical records. Recently, a character-level model combining bidirectional Long Short-Term Memory networks with a conditional random field (BiLSTM-CRF) has achieved great success on Chinese clinical named entity recognition tasks. However, this method captures only the contextual semantics between characters in a sentence. Chinese characters are ideographic: deeper semantic information is hidden in their internal structure, and the BiLSTM-CRF model fails to exploit it. In addition, some entities in a sentence depend on one another, yet the Long Short-Term Memory (LSTM) network does not capture long-range dependencies between characters well. We therefore propose a BiLSTM-CRF model augmented with radical-level features and a self-attention mechanism to address these problems. We use a convolutional neural network (CNN) to extract radical-level features, aiming to capture the intrinsic, internal relevance of characters, and we use self-attention to capture dependencies between characters regardless of their distance. Experiments show that our model achieves F1-scores of 93.00% and 86.34% on the CCKS-2017 and TP_CNER datasets, respectively.
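The radical-level feature extraction described above can be sketched as follows: a character's radical (component) embeddings are convolved with a bank of 1-D kernels and max-pooled over positions, yielding one fixed-size feature vector per character. This is a minimal numpy sketch; the kernel width, filter count, tanh activation, and zero-padding for short characters are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def radical_cnn_feature(radical_embs, filters):
    """Character feature from its radical embedding sequence via
    1-D convolution + max-over-time pooling.

    radical_embs: (n_radicals, d) embeddings of one character's components.
    filters: (n_filters, window, d) convolution kernels.
    Returns an (n_filters,) feature vector for the character.
    """
    n, d = radical_embs.shape
    n_f, w, _ = filters.shape
    # Zero-pad so characters with fewer radicals than the window
    # still yield at least one convolution position (an assumption).
    if n < w:
        radical_embs = np.vstack([radical_embs, np.zeros((w - n, d))])
        n = w
    conv = np.empty((n - w + 1, n_f))
    for i in range(n - w + 1):
        window = radical_embs[i:i + w]            # (w, d) slice
        conv[i] = np.tensordot(filters, window)   # dot each kernel with window
    return np.tanh(conv).max(axis=0)              # max-over-time pooling

rng = np.random.default_rng(1)
embs = rng.normal(size=(4, 6))      # a character with 4 radicals, dim 6
filt = rng.normal(size=(10, 3, 6))  # 10 kernels of window 3
feat = radical_cnn_feature(embs, filt)
print(feat.shape)  # (10,)
```

The pooled vector would then be concatenated with the character embedding before the BiLSTM layer, so the sequence model sees both contextual and sub-character information.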
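The claim that self-attention captures dependencies between characters regardless of their distance can be illustrated with a minimal sketch of scaled dot-product self-attention over the sequence of hidden states (e.g. BiLSTM outputs): every position attends to every other position in a single step, so the attention weight between two characters does not decay with their separation. The single-head formulation and random weights here are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(H, Wq, Wk, Wv):
    """Scaled dot-product self-attention.

    H: (seq_len, d) hidden states, one per character.
    Wq, Wk, Wv: (d, d) projection matrices for queries, keys, values.
    Returns the attended states and the (seq_len, seq_len) weight matrix,
    in which entry (i, j) links characters i and j directly,
    however far apart they are in the sentence.
    """
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # (seq_len, seq_len)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d = 5, 8
H = rng.normal(size=(seq_len, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out, w = self_attention(H, Wq, Wk, Wv)
print(out.shape, np.allclose(w.sum(axis=1), 1.0))  # (5, 8) True
```

In the proposed architecture this layer sits between the BiLSTM and the CRF, re-weighting each character's representation by its relevance to every other character before label decoding.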