Bern University of Applied Sciences, Switzerland.
Stud Health Technol Inform. 2024 Aug 22;316:1008-1012. doi: 10.3233/SHTI240580.
Coding according to the International Classification of Diseases (ICD)-10 and its clinical modifications (CM) is inherently complex and expensive. Natural Language Processing (NLP) assists by simplifying the analysis of unstructured data from electronic health records, thereby facilitating diagnosis coding. This study investigates the suitability of transformer models for ICD-10 classification, considering both encoder and encoder-decoder architectures. The analysis is performed on clinical discharge summaries from the Medical Information Mart for Intensive Care (MIMIC)-IV dataset, which contains an extensive collection of electronic health records. Pre-trained models such as BioBERT, ClinicalBERT, ClinicalLongformer, and ClinicalBigBird are adapted for the coding task, incorporating specific preprocessing techniques to enhance performance. The findings indicate that increasing context length improves accuracy, and that the difference in accuracy between encoder and encoder-decoder models is negligible.
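The general approach described here, adapting a pre-trained clinical encoder to assign multiple ICD-10 codes per discharge summary, can be illustrated with a minimal sketch. This is not the authors' code: the Hugging Face checkpoint ID, the number of target codes, and the 0.5 decision threshold are illustrative assumptions, and longer-context models such as ClinicalLongformer or ClinicalBigBird would simply replace the checkpoint and raise `max_length`.

```python
# Minimal sketch (assumptions noted): multi-label ICD-10 classification of a
# discharge summary with a pre-trained clinical BERT encoder.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "emilyalsentzer/Bio_ClinicalBERT"  # assumed checkpoint; a longer-context model could be substituted
NUM_CODES = 50                                # assumed label space, e.g. the 50 most frequent ICD-10 codes

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_ID,
    num_labels=NUM_CODES,
    problem_type="multi_label_classification",  # sigmoid outputs, one per code
)

summary = "Patient admitted with acute exacerbation of COPD ..."
inputs = tokenizer(summary, truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Each code is predicted independently; threshold the per-code probabilities.
predicted = (torch.sigmoid(logits) > 0.5).nonzero(as_tuple=True)[1].tolist()
print(predicted)  # indices of the predicted ICD-10 codes
```

In practice the classification head would be fine-tuned on labelled MIMIC-IV summaries before inference; the sketch only shows the model setup and prediction step.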