Hierarchical label-wise attention transformer model for explainable ICD coding.

Affiliation

Centre for Big Data Research in Health, University of New South Wales, Sydney, Australia.

Publication information

J Biomed Inform. 2022 Sep;133:104161. doi: 10.1016/j.jbi.2022.104161. Epub 2022 Aug 20.

DOI:10.1016/j.jbi.2022.104161

Abstract

International Classification of Diseases (ICD) coding plays an important role in systematically classifying morbidity and mortality data. In this study, we propose a hierarchical label-wise attention Transformer model (HiLAT) for the explainable prediction of ICD codes from clinical documents. HiLAT first fine-tunes a pretrained Transformer model to represent the tokens of clinical documents. We subsequently employ a two-level hierarchical label-wise attention mechanism that creates label-specific document representations. These representations are in turn used by a feed-forward neural network to predict whether a specific ICD code is assigned to the input clinical document of interest. We evaluate HiLAT using hospital discharge summaries and their corresponding ICD-9 codes from the MIMIC-III database. To investigate the performance of different types of Transformer models, we develop ClinicalplusXLNet, which conducts continual pretraining from XLNet-Base using all the MIMIC-III clinical notes. The experimental results show that HiLAT + ClinicalplusXLNet achieves higher F1 scores than the previous state-of-the-art models on the top-50 most frequent ICD-9 codes from MIMIC-III. Visualisations of attention weights present a potential explainability tool for checking the face validity of ICD code predictions.
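The two-level label-wise attention described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract gives no equations, so the sketch assumes a common formulation in which each label has a learned query vector, level 1 attends over the tokens inside each chunk of the document, and level 2 attends over the resulting chunk representations. The function names, the dot-product scoring, the chunking scheme, and the single linear output layer per label are all assumptions made for illustration, and random arrays stand in for the Transformer token representations.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def label_wise_attention(H, U):
    """One attention distribution per label.

    H: (n, d) representations of n items (tokens here).
    U: (L, d) one query vector per label (assumed parameterisation).
    Returns (L, d) label-specific summaries and (L, n) attention weights.
    """
    scores = U @ H.T                    # (L, n) dot-product scores
    weights = softmax(scores, axis=1)   # each label's weights sum to 1
    return weights @ H, weights


def hierarchical_label_attention(chunks, U_tok, U_chunk, W, b):
    """Two-level label-wise attention followed by a per-label classifier.

    chunks:  (C, T, d) token representations for C chunks of T tokens.
    U_tok:   (L, d) label queries for token-level attention.
    U_chunk: (L, d) label queries for chunk-level attention.
    W, b:    (L, d) and (L,) weights of an assumed per-label linear layer.
    """
    # Level 1: label-specific representation of every chunk -> (C, L, d)
    chunk_reps = np.stack([label_wise_attention(c, U_tok)[0] for c in chunks])
    # Level 2: for each label, attend over its chunk representations
    scores = np.einsum("cld,ld->lc", chunk_reps, U_chunk)      # (L, C)
    chunk_weights = softmax(scores, axis=1)
    doc_reps = np.einsum("lc,cld->ld", chunk_weights, chunk_reps)  # (L, d)
    # Independent binary decision per label (multi-label output)
    logits = (doc_reps * W).sum(axis=1) + b
    probs = 1.0 / (1.0 + np.exp(-logits))
    return probs, chunk_weights


# Toy usage: 3 chunks of 5 tokens, hidden size 8, 4 labels.
rng = np.random.default_rng(0)
C, T, d, L = 3, 5, 8, 4
chunks = rng.normal(size=(C, T, d))
U_tok, U_chunk, W = (rng.normal(size=(L, d)) for _ in range(3))
b = np.zeros(L)
probs, chunk_weights = hierarchical_label_attention(chunks, U_tok, U_chunk, W, b)
```

In this sketch, `chunk_weights[l]` shows how much each chunk contributed to the prediction for label `l`. Weights of this kind are what an attention visualisation would display when checking the face validity of a predicted code.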

Similar articles

Hierarchical label-wise attention transformer model for explainable ICD coding.

J Biomed Inform. 2022 Sep;133:104161. doi: 10.1016/j.jbi.2022.104161. Epub 2022 Aug 20.

Explainable automated coding of clinical notes using hierarchical label-wise attention networks and label embedding initialisation.

J Biomed Inform. 2021 Apr;116:103728. doi: 10.1016/j.jbi.2021.103728. Epub 2021 Mar 9.

Automated ICD coding using extreme multi-label long text transformer-based models.

Artif Intell Med. 2023 Oct;144:102662. doi: 10.1016/j.artmed.2023.102662. Epub 2023 Sep 7.

An explainable CNN approach for medical codes prediction from clinical text.

BMC Med Inform Decis Mak. 2021 Nov 16;21(Suppl 9):256. doi: 10.1186/s12911-021-01615-6.

A Pseudo Label-Wise Attention Network for Automatic ICD Coding.

IEEE J Biomed Health Inform. 2022 Oct;26(10):5201-5212. doi: 10.1109/JBHI.2022.3193291. Epub 2022 Oct 5.

Creating a computer assisted ICD coding system: Performance metric choice and use of the ICD hierarchy.

J Biomed Inform. 2024 Apr;152:104617. doi: 10.1016/j.jbi.2024.104617. Epub 2024 Mar 1.

Automatic International Classification of Diseases Coding System: Deep Contextualized Language Model With Rule-Based Approaches.

JMIR Med Inform. 2022 Jun 29;10(6):e37557. doi: 10.2196/37557.

Can GPT-3.5 generate and code discharge summaries?

J Am Med Inform Assoc. 2024 Oct 1;31(10):2284-2293. doi: 10.1093/jamia/ocae132.

EHR coding with hybrid attention and features propagation on disease knowledge graph.

Artif Intell Med. 2024 Aug;154:102916. doi: 10.1016/j.artmed.2024.102916. Epub 2024 Jun 18.

ICDXML: enhancing ICD coding with probabilistic label trees and dynamic semantic representations.

Sci Rep. 2024 Aug 7;14(1):18319. doi: 10.1038/s41598-024-69214-9.

Cited by

Fine-grained Patient Similarity Measuring using Contrastive Graph Similarity Networks.

Proc (IEEE Int Conf Healthc Inform). 2024 Jun;2024:1-10. doi: 10.1109/ichi61247.2024.00009. Epub 2024 Aug 22.

Explainable text-tabular models for predicting mortality risk in companion animals.

Sci Rep. 2024 Jun 20;14(1):14217. doi: 10.1038/s41598-024-64551-1.

Leveraging Unlabeled Clinical Data to Boost Performance of Risk Stratification Models for Suspected Acute Coronary Syndrome.

AMIA Annu Symp Proc. 2024 Jan 11;2023:744-753. eCollection 2023.
