Yang Zhichao, Kwon Sunjae, Yao Zonghai, Yu Hong
College of Information and Computer Sciences, University of Massachusetts Amherst.
Department of Computer Science, University of Massachusetts Lowell.
Proc AAAI Conf Artif Intell. 2023 Jun 26;37(4):5366-5374. doi: 10.1609/aaai.v37i4.25668.
Automatic International Classification of Diseases (ICD) coding aims to assign multiple ICD codes to a medical note that averages 3,000+ tokens. The task is challenging because of the high-dimensional space of multi-label assignment (155,000+ ICD code candidates) and the long-tail challenge: many ICD codes are infrequently assigned, yet infrequent ICD codes are clinically important. This study addresses the long-tail challenge by transforming this multi-label classification task into an autoregressive generation task. Specifically, we first introduce a novel pretraining objective to generate free-text diagnoses and procedures using the SOAP structure, the medical logic physicians use for note documentation. Second, instead of directly predicting in the high-dimensional space of ICD codes, our model generates lower-dimensional text descriptions, from which ICD codes are then inferred. Third, we design a novel prompt template for multi-label classification. We evaluate our Generation with Prompt (GP) model on the benchmark for all-code assignment (MIMIC-III-full) and the benchmark for few-shot ICD code assignment (MIMIC-III-few). Experiments on MIMIC-III-few show that our model achieves a macro F1 of 30.2, substantially outperforming the previous MIMIC-III-full SOTA model (macro F1 4.3) and the model specifically designed for the few/zero-shot setting (macro F1 18.7). Finally, we design a novel ensemble learner, a cross-attention reranker with prompts, to integrate the predictions of the previous SOTA model and our best few-shot coding model. Experiments on MIMIC-III-full show that our ensemble learner substantially improves both macro and micro F1, from 10.4 to 14.6 and from 58.2 to 59.1, respectively.
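The abstract reports both macro and micro F1, and the gap between them is what makes the long-tail challenge visible: macro F1 averages per-code F1 scores, so every rare code counts as much as a frequent one, while micro F1 pools counts over all assignments. The following is a minimal sketch of that difference on toy data, not the paper's evaluation code; the function names and the labels `common_code` and `rare_code` are hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> Tuple[float, float]:
    """Return (macro F1, micro F1) for multi-label predictions.

    gold[i] and pred[i] are the sets of codes for note i.
    """
    labels = sorted(set().union(*gold, *pred))
    if not labels:
        return 0.0, 0.0

    # Per-label [true positives, false positives, false negatives].
    counts = {label: [0, 0, 0] for label in labels}
    for g, p in zip(gold, pred):
        for label in p & g:
            counts[label][0] += 1
        for label in p - g:
            counts[label][1] += 1
        for label in g - p:
            counts[label][2] += 1

    # Macro: average the per-label F1 scores (each label weighs the same).
    macro = sum(f1(*counts[label]) for label in labels) / len(labels)

    # Micro: pool the counts over all labels, then compute one F1.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = f1(tp, fp, fn)
    return macro, micro


if __name__ == "__main__":
    # Toy data (hypothetical labels): the rare code appears once and is missed.
    gold = [{"common_code"}, {"common_code"}, {"common_code"},
            {"common_code", "rare_code"}]
    pred = [{"common_code"}, {"common_code"}, {"common_code"},
            {"common_code"}]
    macro, micro = macro_micro_f1(gold, pred)
    print(f"macro F1 = {macro:.3f}, micro F1 = {micro:.3f}")
```

In this toy case a model that never predicts the rare code still scores a high micro F1 but only 0.5 macro F1, which mirrors the pattern in the reported numbers: a large macro F1 gain alongside a small micro F1 gain.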