A deep learning approach for medication disposition and corresponding attributes extraction.

Author affiliations

VA Salt Lake City Health Care System, 500, Foothill Boulevard, Salt Lake City 84148, USA; Division of Epidemiology, University of Utah, 295 Chipeta Way, Salt Lake City 84132, USA.

Division of Epidemiology, University of Utah, 295 Chipeta Way, Salt Lake City 84132, USA; Veterans Health Administration Office of Analytics and Performance Integration, 500, Foothill Boulevard, Salt Lake City 84148, USA.

Publication information

J Biomed Inform. 2023 Jul;143:104391. doi: 10.1016/j.jbi.2023.104391. Epub 2023 May 15.

Abstract

OBJECTIVE

This article summarizes our approach to extracting medication mentions and their corresponding attributes from clinical notes, the focus of Track 1 of the 2022 National Natural Language Processing (NLP) Clinical Challenges (n2c2) shared task.

METHODS

The dataset was prepared from the Contextualized Medication Event Dataset (CMED) and includes 500 notes from 296 patients. Our system consisted of three components: medication named entity recognition (NER), event classification (EC), and context classification (CC). All three components were built on transformer models, each with a slightly different architecture and input text engineering. A zero-shot learning solution for CC was also explored.

RESULTS

Our best-performing systems achieved micro-average F1 scores of 0.973, 0.911, and 0.909 for NER, EC, and CC, respectively.
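For readers unfamiliar with the metric, the following is a minimal sketch of how a micro-average F1 score can be computed: true positives, false positives, and false negatives are pooled across all instances before precision and recall are taken. The function name `micro_f1` and the toy labels are illustrative assumptions; this is not the official n2c2 evaluation script, whose matching rules are not described in the abstract.

```python
def micro_f1(gold, pred):
    """Micro-average F1 over parallel lists of label sets.

    gold, pred: lists of sets, one set of labels per instance.
    Counts are pooled over all instances before computing precision/recall.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels predicted and present in gold
        fp += len(p - g)   # labels predicted but absent from gold
        fn += len(g - p)   # gold labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example with made-up labels (not taken from the paper).
gold = [{"A"}, {"B"}, {"A", "C"}]
pred = [{"A"}, {"C"}, {"A"}]
score = micro_f1(gold, pred)
```

Because counts are pooled, frequent labels weigh more heavily in a micro-average than in a macro-average, which averages per-label scores instead.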

CONCLUSION

In this study, we implemented a deep learning-based NLP system and demonstrated that (1) using special tokens helps the model distinguish multiple medication mentions in the same context, and (2) aggregating multiple events of a single medication into multiple labels improves the model's performance.
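The two ideas in the conclusion can be sketched roughly as follows. This is one possible reading, not the authors' implementation: the marker strings `[MED]` and `[/MED]`, the function names, the example sentence, and the label names are all hypothetical, and the abstract does not say which tokens or label inventory were actually used.

```python
def mark_target(text, start, end, open_tok="[MED]", close_tok="[/MED]"):
    """Wrap the target medication span in marker tokens (hypothetical markers).

    With several medications in one sentence, the markers tell the classifier
    which mention the current prediction is about.
    """
    return f"{text[:start]}{open_tok} {text[start:end]} {close_tok}{text[end:]}"


def aggregate_events(annotations, label_set):
    """Collapse multiple event annotations on one mention into a multi-hot vector.

    annotations: iterable of (start, end, label) tuples.
    label_set:   ordered list of all possible labels (assumed for illustration).
    Returns a dict mapping (start, end) to a list of 0/1 flags, one per label.
    """
    grouped = {}
    for start, end, label in annotations:
        grouped.setdefault((start, end), set()).add(label)
    return {
        span: [1 if label in labels else 0 for label in label_set]
        for span, labels in grouped.items()
    }


text = "Stop aspirin and start warfarin today."
marked = mark_target(text, 5, 12)
# marked == "Stop [MED] aspirin [/MED] and start warfarin today."

labels = ["Disposition", "NoDisposition", "Undetermined"]  # illustrative names
annotations = [
    (5, 12, "Disposition"),
    (5, 12, "Undetermined"),   # second event on the same mention
    (23, 31, "Disposition"),
]
targets = aggregate_events(annotations, labels)
# targets == {(5, 12): [1, 0, 1], (23, 31): [1, 0, 0]}
```

In a transformer setup, marker strings like these would typically be registered as additional tokens in the tokenizer so they are not split into subwords, and the multi-hot targets would be trained with a per-label sigmoid output rather than a single softmax.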

