Automated ICD coding for coronary heart diseases by a deep learning method

Authors

Zhao Shuai, Diao Xiaolin, Xia Yun, Huo Yanni, Cui Meng, Wang Yuxin, Yuan Jing, Zhao Wei

Affiliations

Department of Information Center, Fuwai Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100037, China.

Medical Record Department, Fuwai Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100037, China.

Publication information

Heliyon. 2023 Feb 27;9(3):e14037. doi: 10.1016/j.heliyon.2023.e14037. eCollection 2023 Mar.

DOI:10.1016/j.heliyon.2023.e14037

PMID:36938427

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10018467/

Abstract

Automated ICD coding via machine learning that focuses on specific diseases has been a hot topic. Although coronary heart diseases (CHD) are among the leading causes of death, they have seldom been studied specifically in this line of research, probably because of a lack of data targeting these diseases. This study used two datasets: Fuwai-CHD, a private dataset from Fuwai Hospital, and MIMIC-III-CHD, the CHD-related subset of the public MIMIC-III dataset. On these, it aimed at automated CHD coding with a deep learning method that consists mainly of three modules. The first is a BERT variant module responsible for encoding clinical text. In this module, we fine-tuned BERT variants with masked language modeling on clinical text, and proposed a truncation method to address the problem that BERT variants generally cannot handle sequences of more than 512 tokens. The second is a word2vec module for encoding code titles, and the third is a label-attention module for integrating the embeddings of clinical text and code titles. In short, we named the method […]. We compared the method against some widely studied baselines and found that it performed best in most of the coding tasks. Specifically, it reached an F1 of 96.2% and a […] of 98.9% for the top-100 most frequent codes in Fuwai-CHD, which covered 89.2% of the total code occurrences. When predicting the top-50 most frequent codes in MIMIC-III-CHD, it reached an F1 of 40.5% and a […] of 66.1%. Moreover, the method was capable of locating informative tokens in clinical text for predicting the target codes. In summary, the method not only suggests CHD codes accurately but also offers robust interpretability, and hence has great potential for facilitating CHD coding in practice.
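
The abstract describes a module that combines clinical-text embeddings with code-title embeddings and can point to informative tokens. The following is a minimal, dependency-free sketch of a generic label-wise attention step, not the authors' implementation. The function name `label_attention`, the dot-product scoring, the per-label output weights, and the random toy vectors are all assumptions made for illustration. In practice the token vectors would come from the text encoder and the label vectors from the code-title embeddings.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def label_attention(token_vecs, label_vecs, out_weights, out_bias):
    """Return (per-label probabilities, per-label attention over tokens).

    token_vecs:  n_tokens x d  contextual token embeddings (assumed input)
    label_vecs:  n_labels x d  code-title embeddings (assumed input)
    out_weights: n_labels x d  hypothetical per-label output weights
    out_bias:    n_labels      hypothetical per-label output biases
    """
    probs, attention = [], []
    for label_vec, w, b in zip(label_vecs, out_weights, out_bias):
        # Score each token against this label, then normalise over tokens.
        alpha = softmax([dot(label_vec, tok) for tok in token_vecs])
        # Label-specific document vector: attention-weighted sum of tokens.
        dim = len(label_vec)
        doc = [sum(a * tok[j] for a, tok in zip(alpha, token_vecs))
               for j in range(dim)]
        # Independent sigmoid per label, since ICD coding is multi-label.
        logit = dot(w, doc) + b
        probs.append(1.0 / (1.0 + math.exp(-logit)))
        attention.append(alpha)
    return probs, attention


# Toy inputs: random vectors stand in for real encoder outputs.
random.seed(0)
n_tokens, n_labels, dim = 6, 3, 4
tokens = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_tokens)]
labels = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_labels)]
weights = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_labels)]
biases = [0.0] * n_labels

probs, attention = label_attention(tokens, labels, weights, biases)
# The most attended token index per label is a simple interpretability signal.
top_tokens = [max(range(n_tokens), key=row.__getitem__) for row in attention]
```

Each row of `attention` is a distribution over tokens for one code, so inspecting its largest entries is one plausible way to surface the tokens that drove a prediction.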

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/87dd/10018467/bf1604c74d6a/gr1.jpg
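
The abstract reports that the top-100 most frequent codes in Fuwai-CHD cover 89.2% of all code occurrences. A coverage figure of this kind could be computed as sketched below. This is a sketch under stated assumptions: the helper name `top_k_coverage` and the toy records are hypothetical, and the ICD-10 codes are illustrative rather than taken from either dataset.

```python
from collections import Counter


def top_k_coverage(records, k):
    """Fraction of all code occurrences accounted for by the k most frequent codes."""
    counts = Counter(code for codes in records for code in codes)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(n for _, n in counts.most_common(k))
    return covered / total


# Toy records, one list of assigned codes per discharge record (illustrative only).
toy_records = [
    ["I25.1", "I10"],
    ["I25.1", "E11.9"],
    ["I21.0", "I25.1", "I10"],
    ["I20.0"],
]
coverage_top2 = top_k_coverage(toy_records, 2)
```

Here the two most frequent codes account for 5 of the 8 code occurrences, so `coverage_top2` is 0.625.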
