Chinese medical named entity recognition integrating adversarial training and feature enhancement.

Authors

Zhang Xu, Kao Youchen, Che Shengbing, Yan Juan, Zhou Sha, Guo Shenyi, Wang Wanqin

Affiliations

College of Computer Science and Mathematics, Central South University of Forestry and Technology, No.498 Shaoshan South Road, Wenyuan Street, Changsha, 410004, Hunan, China.

Information and Engineering College, Swan College, Central South University of Forestry and Technology, No.1-10 Furong North Road, Changsha, 410211, Hunan, China.

Publication information

Sci Rep. 2025 Apr 28;15(1):14844. doi: 10.1038/s41598-025-98465-3.

DOI:10.1038/s41598-025-98465-3
PMID:40295595
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12037839/
Abstract

Chinese has a distinctive character composition structure, and medical entities are often nested. These properties pose challenges for medical named entity recognition on Chinese Electronic Health Records (EHRs), including scarce annotated data, strong tokenization ambiguity, and blurred entity boundaries, all of which make medical entity categories harder to extract. To address this, the paper proposes a Chinese clinical named entity recognition model that integrates BERT and adversarial enhancement in a dual-channel architecture. First, the model combines Bidirectional Long Short-Term Memory networks (BiLSTM), Iterative Deep Convolutional Neural Networks (IDCNN), and Conditional Random Fields (CRF) to improve the accuracy of named entity recognition. Second, the authors collected texts from medical record websites and used the YEDDA tool to annotate and process them, forming a more comprehensive target dataset. This ensures that the model is exposed to representative Chinese clinical data during training, thereby improving recognition performance. Finally, experimental results indicate that the BPBIC model achieved a precision of 93.80%, a recall of 94.44%, and an F1 score of 94.12% on the augmented CCKS2019 dataset (CCKS2019+). Moreover, through knowledge graph analysis of medical entities extracted from single-disease and multi-disease EHRs, the model assists doctors in reaching rapid and accurate diagnoses, thereby improving the efficiency of healthcare professionals.
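
The three reported metrics can be cross-checked against each other, because F1 is the harmonic mean of precision and recall. The following is a minimal sketch in Python, assuming the standard F1 definition; the function name `f1_score` is illustrative and does not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (standard F1)."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for BPBIC on CCKS2019+, in percent.
precision, recall = 93.80, 94.44
f1 = f1_score(precision, recall)
print(f"{f1:.2f}")  # 94.12
```

The result rounds to 94.12, which agrees with the F1 score stated in the abstract.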


Similar articles

1
Extracting clinical named entity for pituitary adenomas from Chinese electronic medical records.
BMC Med Inform Decis Mak. 2022 Mar 23;22(1):72. doi: 10.1186/s12911-022-01810-z.
2
Chinese Clinical Named Entity Recognition From Electronic Medical Records Based on Multisemantic Features by Using Robustly Optimized Bidirectional Encoder Representation From Transformers Pretraining Approach Whole Word Masking and Convolutional Neural Networks: Model Development and Validation.
JMIR Med Inform. 2023 May 10;11:e44597. doi: 10.2196/44597.
3
A deep learning model incorporating part of speech and self-matching attention for named entity recognition of Chinese electronic medical records.
BMC Med Inform Decis Mak. 2019 Apr 9;19(Suppl 2):65. doi: 10.1186/s12911-019-0762-7.
4
MF-MNER: Multi-models Fusion for MNER in Chinese Clinical Electronic Medical Records.
Interdiscip Sci. 2024 Jun;16(2):489-502. doi: 10.1007/s12539-024-00624-z. Epub 2024 Apr 5.
5
Chinese Clinical Named Entity Recognition With Segmentation Synonym Sentence Synthesis Mechanism: Algorithm Development and Validation.
JMIR Med Inform. 2024 Nov 21;12:e60334. doi: 10.2196/60334.
6
Named entity recognition of Chinese electronic medical records based on a hybrid neural network and medical MC-BERT.
BMC Med Inform Decis Mak. 2022 Dec 1;22(1):315. doi: 10.1186/s12911-022-02059-2.
7
Clinical Named Entity Recognition From Chinese Electronic Health Records via Machine Learning Methods.
JMIR Med Inform. 2018 Dec 17;6(4):e50. doi: 10.2196/medinform.9965.
8
Chinese clinical named entity recognition with radical-level feature and self-attention mechanism.
J Biomed Inform. 2019 Oct;98:103289. doi: 10.1016/j.jbi.2019.103289. Epub 2019 Sep 18.
9
Ontology-Based Healthcare Named Entity Recognition from Twitter Messages Using a Recurrent Neural Network Approach.
Int J Environ Res Public Health. 2019 Sep 27;16(19):3628. doi: 10.3390/ijerph16193628.
