Department of Biomedical Informatics & Medical Education, University of Washington, Seattle, WA 98195, United States.
Department of Information Sciences and Technology, George Mason University, Fairfax, VA 22030, United States.
J Am Med Inform Assoc. 2024 Nov 1;31(11):2583-2594. doi: 10.1093/jamia/ocae231.
Clinical notes contain unstructured representations of patient histories, including the relationships between medical problems and prescription drugs. To investigate the relationship between cancer drugs and their associated symptom burden, we extract structured, semantic representations of medical problem and drug information from the clinical narratives of oncology notes.
We present Clinical concept Annotations for Cancer Events and Relations (CACER), a novel corpus with fine-grained annotations for over 48 000 medical problem and drug events and over 10 000 drug-problem and problem-problem relations. Leveraging CACER, we develop and evaluate transformer-based information extraction models, including Bidirectional Encoder Representations from Transformers (BERT), Fine-tuned Language Net Text-To-Text Transfer Transformer (Flan-T5), Large Language Model Meta AI (Llama3), and Generative Pre-trained Transformer 4 (GPT-4), using fine-tuning and in-context learning (ICL).
In event extraction, the fine-tuned BERT and Llama3 models achieved the highest performance at 88.0-88.2 F1, which is comparable to the inter-annotator agreement (IAA) of 88.4 F1. In relation extraction, the fine-tuned BERT, Flan-T5, and Llama3 models achieved the highest performance at 61.8-65.3 F1. GPT-4 with ICL achieved the worst performance on both tasks.
The fine-tuned models significantly outperformed GPT-4 with ICL, highlighting the importance of annotated training data and model optimization. Furthermore, the BERT models performed similarly to Llama3. For our task, large language models offer no performance advantage over the smaller BERT models.
We introduce CACER, a novel corpus with fine-grained annotations for medical problems, drugs, and their relationships in clinical narratives of oncology notes. State-of-the-art transformer models achieved performance comparable to IAA for several extraction tasks.