A contextual multi-task neural approach to medication and adverse events identification from clinical text.

Affiliations

Department of Computer Science and Engineering, Amrita Vishwa Vidyapeetham, Amritapuri, India.

School of Mathematics, Georgia Institute of Technology, Atlanta, GA, USA.

Publication Information

J Biomed Inform. 2022 Jan;125:103960. doi: 10.1016/j.jbi.2021.103960. Epub 2021 Dec 4.

Abstract

Effective wide-scale pharmacovigilance calls for accurate named entity recognition (NER) of medication entities such as drugs, dosages, reasons, and adverse drug events (ADE) from clinical text. The scarcity of adverse event annotations and underlying semantic ambiguities make accurate scope identification challenging. The current research explores integrating contextualized language models and multi-task learning across diverse clinical NER datasets to mitigate this challenge. We propose a novel multi-task adaptation method to refine the embeddings generated by the Bidirectional Encoder Representations from Transformers (BERT) language model and improve inter-task knowledge sharing. We integrated the adapted BERT model into a unique hierarchical multi-task neural network comprising the medication and auxiliary clinical NER tasks. We validated the model using two different versions of BERT on diverse well-studied clinical tasks: Medication and ADE (n2c2 2018/n2c2 2009), Clinical Concepts (n2c2 2010/n2c2 2012), and Disorders (ShAReCLEF 2013). Overall medication extraction performance improved by up to +1.19 F1 (n2c2 2018), while generalization improved by +5.38 F1 (n2c2 2009) compared to standalone BERT baselines. ADE recognition improved significantly (McNemar's test), outperforming prior baselines. Similar benefits were observed on the auxiliary clinical and disorder tasks. We demonstrate that combining multi-dataset BERT adaptation and multi-task learning outperforms prior medication extraction methods without requiring additional features, newer training data, or ensembling. Taken together, the study contributes an initial case study toward integrating diverse clinical datasets in an end-to-end NER model for clinical decision support.
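The shared-encoder, per-task-head pattern the abstract describes can be illustrated with a minimal NumPy sketch. Here a small projection stands in for the adapted BERT encoder, and two token-classification heads (medication NER and an auxiliary clinical NER task) consume the same shared representation and contribute to a joint loss. All names, tag sets, and dimensions are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class SharedEncoder:
    """Stand-in for the adapted BERT encoder: one shared projection."""
    def __init__(self, d_in, d_hid):
        self.W = rng.normal(scale=0.02, size=(d_in, d_hid))
    def __call__(self, x):
        return np.tanh(x @ self.W)

class TaskHead:
    """Per-task token-classification head over the shared representation."""
    def __init__(self, d_hid, n_tags):
        self.W = rng.normal(scale=0.02, size=(d_hid, n_tags))
    def __call__(self, h):
        return softmax(h @ self.W)

def cross_entropy(probs, labels):
    n = labels.size
    return -np.log(probs.reshape(n, -1)[np.arange(n), labels.ravel()]).mean()

# Toy batch: 2 sentences x 8 tokens x 16-dim input "token embeddings".
x = rng.normal(size=(2, 8, 16))
encoder = SharedEncoder(16, 32)
heads = {
    "medication": TaskHead(32, 5),  # e.g. O, B-Drug, I-Drug, B-ADE, I-ADE
    "clinical":   TaskHead(32, 3),  # auxiliary task with its own tag set
}

h = encoder(x)  # shared contextual representation feeds every head
losses = {}
for task, head in heads.items():
    probs = head(h)
    # Random labels here; in training these come from each dataset's annotations.
    labels = rng.integers(0, probs.shape[-1], size=(2, 8))
    losses[task] = cross_entropy(probs, labels)

total_loss = sum(losses.values())  # joint objective over both tasks
```

Because the heads share the encoder, gradients from the auxiliary clinical task would also update the shared parameters, which is the inter-task knowledge sharing the abstract attributes to the multi-task setup.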

