Gao Yanjun, Miller Timothy, Xu Dongfang, Dligach Dmitriy, Churpek Matthew M, Afshar Majid
ICU Data Science Lab, School of Medicine and Public Health, University of Wisconsin-Madison.
Boston Children's Hospital, and Harvard Medical School.
Proc Int Conf Comput Ling. 2022 Oct;2022:2979-2991.
Automatically summarizing patients' main problems from daily progress notes using natural language processing methods helps combat information and cognitive overload in hospital settings and potentially assists providers with computerized diagnostic decision support. Problem list summarization requires a model to understand, abstract, and generate clinical documentation. In this work, we propose a new NLP task that aims to generate a list of problems in a patient's daily care plan using input from the provider's progress notes during hospitalization. We investigate the performance of T5 and BART, two state-of-the-art seq2seq transformer architectures, in solving this problem. We provide a corpus built on top of publicly available electronic health record progress notes from the Medical Information Mart for Intensive Care (MIMIC)-III. T5 and BART are trained on general domain text, so we experiment with a data augmentation method and a domain adaptation pre-training method to increase their exposure to medical vocabulary and knowledge. Evaluation methods include ROUGE, BERTScore, cosine similarity on sentence embeddings, and F-score on medical concepts. Results show that T5 with domain adaptive pre-training achieves significant performance gains compared to a rule-based system and general domain pre-trained language models, indicating a promising direction for tackling the problem summarization task.
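The ROUGE family of metrics mentioned above scores a generated problem list by n-gram overlap with a reference. A minimal sketch of ROUGE-1 F1, assuming simple whitespace tokenization (the paper's exact tokenization and ROUGE variant may differ):

```python
from collections import Counter

def rouge1_f(reference: str, candidate: str) -> float:
    """Unigram-overlap ROUGE-1 F1 between a reference and a candidate summary."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Illustrative problem-list strings (not drawn from the MIMIC-III corpus)
ref = "acute kidney injury sepsis hypotension"
hyp = "sepsis with acute kidney injury"
print(round(rouge1_f(ref, hyp), 3))  # → 0.8
```

Counting clipped overlaps via `Counter` intersection keeps repeated tokens from being credited more times than they appear in the reference.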
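The F-score on medical concepts can likewise be sketched as set overlap between concept identifiers extracted from the reference and generated problem lists. This is a hypothetical illustration; the paper would rely on a clinical concept extractor, and the CUI values below are for demonstration only:

```python
def concept_f1(gold: set, pred: set) -> float:
    """Set-based F1 over extracted medical concept identifiers (e.g., UMLS CUIs)."""
    tp = len(gold & pred)  # concepts found in both lists
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)

# Illustrative concept sets (CUI-style identifiers, chosen for demonstration)
gold = {"C0022660", "C0243026"}  # e.g., acute kidney injury, sepsis
pred = {"C0022660", "C0020649"}  # e.g., acute kidney injury, hypotension
print(concept_f1(gold, pred))  # → 0.5
```

Concept-level overlap rewards clinically equivalent wording ("AKI" vs. "acute kidney injury") that surface n-gram metrics such as ROUGE would miss.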