
Stacking-BERT model for Chinese medical procedure entity normalization.

Affiliations

Institute of Medical Information, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing, China.

National Engineering Laboratory for Internet Medical Systems and Applications, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China.

Publication Information

Math Biosci Eng. 2023 Jan;20(1):1018-1036. doi: 10.3934/mbe.2023047. Epub 2022 Oct 24.

DOI: 10.3934/mbe.2023047
PMID: 36650800
Abstract

Medical procedure entity normalization is an important task for realizing medical information sharing at the semantic level; in real-world practice it faces major challenges such as term variety and similarity. Although deep learning-based methods have been successfully applied to biomedical entity normalization, they often depend on traditional context-independent word embeddings, and there is minimal research on medical entity recognition in Chinese. Regarding the entity normalization task as a sentence-pair classification task, we applied a three-step framework to normalize Chinese medical procedure terms, consisting of dataset construction, candidate concept generation and candidate concept ranking. For dataset construction, an external knowledge base and easy data augmentation (EDA) techniques were used to increase the diversity of training samples. For candidate concept generation, we implemented a BM25 retrieval method that integrates synonym knowledge from SNOMED CT and the training data. For candidate concept ranking, we designed a stacking-BERT model, comprising the original BERT-based and Siamese-BERT ranking models, to capture semantic information and choose the optimal mapping pairs through a stacking mechanism. During training, we also applied adversarial training tricks to improve the model's ability to learn from small-scale training data. On the clinical entity normalization task dataset of the 5th China Health Information Processing Conference, our stacking-BERT model achieved an accuracy of 93.1%, outperforming single BERT models and other traditional deep learning models. In conclusion, this paper presents an effective method for Chinese medical procedure entity normalization and a validation of different BERT-based models. In addition, we found that adversarial training and data augmentation can effectively improve deep learning models on small samples, which may provide useful ideas for future research.
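The candidate concept generation step can be illustrated with a minimal BM25 retriever over a concept vocabulary. This is a sketch, not the paper's implementation: the character-bigram tokenizer, the parameter values (k1 = 1.5, b = 0.75) and the toy concept list below are illustrative assumptions.

```python
import math
from collections import Counter

def bigrams(term):
    # Character bigrams are a common tokenization for short Chinese terms;
    # fall back to the whole term for single-character inputs.
    return [term[i:i + 2] for i in range(len(term) - 1)] or [term]

class BM25:
    """Minimal Okapi BM25 ranker over a list of concept strings."""

    def __init__(self, concepts, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.raw = concepts
        self.docs = [bigrams(c) for c in concepts]
        self.N = len(self.docs)
        self.avgdl = sum(len(d) for d in self.docs) / self.N
        # document frequency of each token
        self.df = Counter(t for d in self.docs for t in set(d))

    def idf(self, t):
        n = self.df.get(t, 0)
        return math.log((self.N - n + 0.5) / (n + 0.5) + 1.0)

    def score(self, query, i):
        d = self.docs[i]
        tf = Counter(d)
        s = 0.0
        for t in bigrams(query):
            f = tf.get(t, 0)
            if f:
                denom = f + self.k1 * (1 - self.b + self.b * len(d) / self.avgdl)
                s += self.idf(t) * f * (self.k1 + 1) / denom
        return s

    def top_k(self, query, k=5):
        ranked = sorted(range(self.N), key=lambda i: self.score(query, i),
                        reverse=True)
        return [self.raw[i] for i in ranked[:k]]
```

A surface-variant query then retrieves the closest standard concepts as candidates for the downstream BERT rankers; the paper additionally folds SNOMED CT synonyms into the retrieval index, which this sketch omits.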

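The stacking mechanism in the candidate-ranking step can be sketched as score-level fusion: each base ranker (the original BERT and Siamese-BERT models in the paper) assigns a relevance score to every candidate, and a meta-layer combines them to pick the final mapping. The weighted-sum meta-learner with a grid search over a dev set is an illustrative assumption here; the paper's actual stacking model may combine the base rankers differently.

```python
def stack_scores(score_lists, weights):
    """Weighted sum of per-candidate scores from several base rankers."""
    n = len(score_lists[0])
    assert all(len(s) == n for s in score_lists)
    return [sum(w * s[i] for w, s in zip(weights, score_lists))
            for i in range(n)]

def pick_mapping(candidates, score_lists, weights):
    """Return the candidate concept with the highest fused score."""
    fused = stack_scores(score_lists, weights)
    return max(zip(candidates, fused), key=lambda p: p[1])[0]

def fit_weights(dev, grid):
    """Choose meta-weights by accuracy on a dev set.

    dev: list of (candidates, score_lists, gold_concept) triples.
    grid: iterable of weight tuples to try, one weight per base ranker.
    """
    best, best_acc = None, -1.0
    for w in grid:
        acc = sum(pick_mapping(c, s, w) == g for c, s, g in dev) / len(dev)
        if acc > best_acc:
            best, best_acc = w, acc
    return best
```

With real models, the score lists would come from the fine-tuned BERT sentence-pair classifier and the Siamese-BERT similarity model; the fusion step itself is model-agnostic.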

Similar Articles

1
Stacking-BERT model for Chinese medical procedure entity normalization.
Math Biosci Eng. 2023 Jan;20(1):1018-1036. doi: 10.3934/mbe.2023047. Epub 2022 Oct 24.
2
A study of entity-linking methods for normalizing Chinese diagnosis and procedure terms to ICD codes.
J Biomed Inform. 2020 May;105:103418. doi: 10.1016/j.jbi.2020.103418. Epub 2020 Apr 13.
3
BERT-based Ranking for Biomedical Entity Normalization.
AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:269-277. eCollection 2020.
4
Fine-Tuning Bidirectional Encoder Representations From Transformers (BERT)-Based Models on Large-Scale Electronic Health Record Notes: An Empirical Study.
JMIR Med Inform. 2019 Sep 12;7(3):e14830. doi: 10.2196/14830.
5
SiBERT: A Siamese-based BERT network for Chinese medical entities alignment.
Methods. 2022 Sep;205:133-139. doi: 10.1016/j.ymeth.2022.07.003. Epub 2022 Jul 4.
6
Application of Entity-BERT model based on neuroscience and brain-like cognition in electronic medical record entity recognition.
Front Neurosci. 2023 Sep 20;17:1259652. doi: 10.3389/fnins.2023.1259652. eCollection 2023.
7
Named entity recognition of Chinese electronic medical records based on a hybrid neural network and medical MC-BERT.
BMC Med Inform Decis Mak. 2022 Dec 1;22(1):315. doi: 10.1186/s12911-022-02059-2.
8
Extracting clinical named entity for pituitary adenomas from Chinese electronic medical records.
BMC Med Inform Decis Mak. 2022 Mar 23;22(1):72. doi: 10.1186/s12911-022-01810-z.
9
Multi-Level Representation Learning for Chinese Medical Entity Recognition: Model Development and Validation.
JMIR Med Inform. 2020 May 4;8(5):e17637. doi: 10.2196/17637.
10
Analyzing transfer learning impact in biomedical cross-lingual named entity recognition and normalization.
BMC Bioinformatics. 2021 Dec 17;22(Suppl 1):601. doi: 10.1186/s12859-021-04247-9.

Cited By

1
Use of SNOMED CT in Large Language Models: Scoping Review.
JMIR Med Inform. 2024 Oct 7;12:e62924. doi: 10.2196/62924.