Does the magic of BERT apply to medical code assignment? A quantitative study.

Affiliation

Department of Computer Science, Aalto University, Espoo, 00076, Finland.

Publication information

Comput Biol Med. 2021 Dec;139:104998. doi: 10.1016/j.compbiomed.2021.104998. Epub 2021 Oct 30.

DOI: 10.1016/j.compbiomed.2021.104998
PMID: 34739971
Abstract

Unsupervised pretraining is an integral part of many natural language processing systems, and transfer learning with language models has achieved remarkable results in downstream tasks. In the clinical application of medical code assignment, diagnosis and procedure codes are inferred from lengthy clinical notes such as hospital discharge summaries. However, it is not clear if pretrained models are useful for medical code prediction without further architecture engineering. This paper conducts a comprehensive quantitative analysis of various contextualized language models' performances, pretrained in different domains, for medical code assignment from clinical notes. We propose a hierarchical fine-tuning architecture to capture interactions between distant words and adopt label-wise attention to exploit label information. Contrary to current trends, we demonstrate that a carefully trained classical CNN outperforms attention-based models on a MIMIC-III subset with frequent codes. Our empirical findings suggest directions for building robust medical code assignment models.
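The two ideas named in the abstract, splitting a long note into pieces and attending to tokens separately for each label, can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not give the architecture's details, so the function names, the fixed-size chunking rule, and the dot-product scoring below are all assumptions chosen for illustration. It follows the commonly used form of label-wise attention, in which each label has its own query vector, a softmax over token scores yields per-label weights, and a sigmoid gives an independent probability per code.

```python
import math


def split_into_chunks(tokens, chunk_size):
    """Split a long token list into consecutive chunks of at most chunk_size tokens.

    Hypothetical helper: a hierarchical model would encode each chunk
    separately before combining the chunk representations.
    """
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_wise_attention(hidden, label_queries):
    """Compute one attended document vector per label.

    hidden: n token vectors of length d (assumed encoder output)
    label_queries: L query vectors of length d, one per medical code
    Returns (doc_vectors, weights), each with one entry per label.
    """
    dim = len(hidden[0])
    doc_vectors, all_weights = [], []
    for query in label_queries:
        # Score every token against this label, then normalize.
        weights = softmax([dot(h, query) for h in hidden])
        # Weighted sum of token vectors gives the label-specific view.
        vector = [sum(w * h[k] for w, h in zip(weights, hidden)) for k in range(dim)]
        doc_vectors.append(vector)
        all_weights.append(weights)
    return doc_vectors, all_weights


def label_probabilities(doc_vectors, out_weights, out_biases):
    """Independent sigmoid per label, as in multi-label classification."""
    return [
        1.0 / (1.0 + math.exp(-(dot(v, w) + b)))
        for v, w, b in zip(doc_vectors, out_weights, out_biases)
    ]


if __name__ == "__main__":
    # Toy data: three tokens, two labels, two-dimensional vectors.
    hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    label_queries = [[2.0, 0.0], [0.0, 2.0]]
    vectors, weights = label_wise_attention(hidden, label_queries)
    probs = label_probabilities(vectors, [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0])
    print(split_into_chunks(list("abcde"), 2))
    print(weights)
    print(probs)
```

Because every label gets its own attention weights, the weights can also be inspected to see which tokens contributed to a given code, which is one reason this mechanism is popular for medical coding.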

Similar articles

1. Does the magic of BERT apply to medical code assignment? A quantitative study.
   Comput Biol Med. 2021 Dec;139:104998. doi: 10.1016/j.compbiomed.2021.104998. Epub 2021 Oct 30.
2. Explainable automated coding of clinical notes using hierarchical label-wise attention networks and label embedding initialisation.
   J Biomed Inform. 2021 Apr;116:103728. doi: 10.1016/j.jbi.2021.103728. Epub 2021 Mar 9.
3. Extracting comprehensive clinical information for breast cancer using deep learning methods.
   Int J Med Inform. 2019 Dec;132:103985. doi: 10.1016/j.ijmedinf.2019.103985. Epub 2019 Oct 2.
4. Comparison of different feature extraction methods for applicable automated ICD coding.
   BMC Med Inform Decis Mak. 2022 Jan 12;22(1):11. doi: 10.1186/s12911-022-01753-5.
5. When BERT meets Bilbo: a learning curve analysis of pretrained language model on disease classification.
   BMC Med Inform Decis Mak. 2022 Apr 5;21(Suppl 9):377. doi: 10.1186/s12911-022-01829-2.
6. Disambiguating Clinical Abbreviations Using a One-Fits-All Classifier Based on Deep Learning Techniques.
   Methods Inf Med. 2022 Jun;61(S 01):e28-e34. doi: 10.1055/s-0042-1742388. Epub 2022 Feb 1.
7. Med-BERT: pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction.
   NPJ Digit Med. 2021 May 20;4(1):86. doi: 10.1038/s41746-021-00455-y.
8. Unlocking the Secrets Behind Advanced Artificial Intelligence Language Models in Deidentifying Chinese-English Mixed Clinical Text: Development and Validation Study.
   J Med Internet Res. 2024 Jan 25;26:e48443. doi: 10.2196/48443.
9. Symptom-BERT: Enhancing Cancer Symptom Detection in EHR Clinical Notes.
   J Pain Symptom Manage. 2024 Aug;68(2):190-198.e1. doi: 10.1016/j.jpainsymman.2024.05.015. Epub 2024 May 23.
10. Relation Classification for Bleeding Events From Electronic Health Records Using Deep Learning Systems: An Empirical Study.
   JMIR Med Inform. 2021 Jul 2;9(7):e27527. doi: 10.2196/27527.

Cited by

1. Multi-Modal Fusion of Routine Care Electronic Health Records (EHR): A Scoping Review.
   Information (Basel). 2025 Jan;16(1). doi: 10.3390/info16010054. Epub 2025 Jan 15.
2. Leveraging BERT for embedding ICD codes from large scale cardiovascular EMR data to understand patient diagnostic patterns.
   BMC Med Inform Decis Mak. 2025 Aug 11;25(1):300. doi: 10.1186/s12911-025-03145-x.
3. BERT-DomainAFP: Antifreeze protein recognition and classification model based on BERT and structural domain annotation.
   iScience. 2025 Mar 6;28(4):112077. doi: 10.1016/j.isci.2025.112077. eCollection 2025 Apr 18.
4. MISTIC: a novel approach for metastasis classification in Italian electronic health records using transformers.
   BMC Med Inform Decis Mak. 2025 Apr 10;25(1):160. doi: 10.1186/s12911-025-02994-w.
5. Predicting ICU Readmission from Electronic Health Records via BERTopic with Long Short Term Memory Network Approach.
   J Clin Med. 2024 Sep 18;13(18):5503. doi: 10.3390/jcm13185503.
6. Dementia risk prediction using decision-focused content selection from medical notes.
   Comput Biol Med. 2024 Nov;182:109144. doi: 10.1016/j.compbiomed.2024.109144. Epub 2024 Sep 18.
7. Traditional Machine Learning, Deep Learning, and BERT (Large Language Model) Approaches for Predicting Hospitalizations From Nurse Triage Notes: Comparative Evaluation of Resource Management.
   JMIR AI. 2024 Aug 27;3:e52190. doi: 10.2196/52190.
8. AnEMIC: A Framework for Benchmarking ICD Coding Models.
   Proc Conf Empir Methods Nat Lang Process. 2022 Dec;2022(SD):109-120. doi: 10.18653/v1/2022.emnlp-demos.11.
9. Algorithmic Identification of Treatment-Emergent Adverse Events From Clinical Notes Using Large Language Models: A Pilot Study in Inflammatory Bowel Disease.
   Clin Pharmacol Ther. 2024 Jun;115(6):1391-1399. doi: 10.1002/cpt.3226. Epub 2024 Mar 8.
10. A two-stream deep model for automated ICD-9 code prediction in an intensive care unit.
   Heliyon. 2024 Feb 8;10(4):e25960. doi: 10.1016/j.heliyon.2024.e25960. eCollection 2024 Feb 29.