Adverse drug event and medication extraction in electronic health records via a cascading architecture with different sequence labeling models and word embeddings.

Author affiliations

Department of Electrical Engineering, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan.

Department of Post-Baccalaureate Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan.

Publication information

J Am Med Inform Assoc. 2020 Jan 1;27(1):47-55. doi: 10.1093/jamia/ocz120.

DOI: 10.1093/jamia/ocz120

PMID: 31334805

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7489070/

Abstract

OBJECTIVE

An adverse drug event (ADE) is an injury resulting from a medical intervention related to a drug, including harm caused by the drug itself or by its use. Extracting ADEs from clinical records can help physicians associate adverse events with the targeted drugs.

MATERIALS AND METHODS

We proposed a cascading architecture to recognize medical concepts including ADEs, drug names, and drug-related entities. The architecture includes a preprocessing method and an ensemble of conditional random fields (CRFs) and neural network-based models to address, respectively, the challenges of surrogate strings and overlapping annotation boundaries observed in the employed ADE and medication extraction (ADME) corpus. We also studied the effectiveness of applying different pretrained and postprocessed word embeddings to the ADME task.
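The abstract does not spell out how the cascade is staged, so the following is only a minimal sketch of the general idea: a later stage receives the spans found by earlier stages, which lets two overlapping annotations coexist even though a single BIO-style tagger can assign only one label per token. The toy lexicon, the "induced" rule, and the names `drug_stage`, `ade_stage`, and `run_cascade` are all hypothetical stand-ins for trained CRF or neural taggers.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int, str]  # (start, end exclusive, concept type), in token offsets
Stage = Callable[[List[str], List[Span]], List[Span]]

DRUG_LEXICON = {"heparin", "warfarin"}  # toy lexicon, purely illustrative


def drug_stage(tokens: List[str], previous: List[Span]) -> List[Span]:
    """Stage 1: tag single-token drug names (stand-in for a trained tagger)."""
    return [(i, i + 1, "Drug") for i, tok in enumerate(tokens)
            if tok.lower() in DRUG_LEXICON]


def ade_stage(tokens: List[str], previous: List[Span]) -> List[Span]:
    """Stage 2: use stage-1 drug spans as context to tag '<drug> induced <event>'."""
    spans = []
    for start, end, label in previous:
        if (label == "Drug" and end + 1 < len(tokens)
                and tokens[end].lower() == "induced"):
            spans.append((start, end + 2, "ADE"))
    return spans


def run_cascade(tokens: List[str], stages: List[Stage]) -> List[Span]:
    """Run stages in order; each stage sees every span found so far."""
    found: List[Span] = []
    for stage in stages:
        found.extend(stage(tokens, found))
    return sorted(set(found))


tokens = "Patient developed heparin induced thrombocytopenia".split()
spans = run_cascade(tokens, [drug_stage, ade_stage])
print(spans)  # [(2, 3, 'Drug'), (2, 5, 'ADE')]
```

In this toy run the Drug span and the ADE span share the token "heparin", which is the kind of overlap a single flat label sequence cannot represent.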

RESULTS

The empirical results showed that both CRFs and neural network-based models provide promising solutions for the ADME task. The neural network-based models particularly outperformed CRFs on concept types involving narrative descriptions. Our best run achieved an overall micro F-score of 0.919 on the employed corpus. Our results also suggested that the general-domain Global Vectors for Word Representation (GloVe) embedding provides a very strong baseline, which can be further improved by applying principal component analysis to generate more isotropic vectors.
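One common way to make embeddings more isotropic with principal component analysis is to center the vectors and project out their few dominant principal directions. The abstract does not state the exact procedure used, so the sketch below is an assumption: the function name `remove_top_components`, the choice of one removed component, and the random toy vectors are illustrative and are not taken from the paper.

```python
import numpy as np


def remove_top_components(vectors: np.ndarray, n_components: int = 1) -> np.ndarray:
    """Center the vectors, then project out their top principal directions."""
    centered = vectors - vectors.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top = vt[:n_components]  # shape: (n_components, dim)
    return centered - centered @ top.T @ top


rng = np.random.default_rng(0)
# Toy "embeddings": 200 vectors in 8 dimensions with one dominant direction.
toy = rng.normal(size=(200, 8))
toy[:, 0] = toy[:, 0] * 10.0 + 5.0

processed = remove_top_components(toy, n_components=1)
before = np.linalg.svd(toy - toy.mean(axis=0), compute_uv=False)
after = np.linalg.svd(processed, compute_uv=False)
print("largest/second singular value before:", before[0] / before[1])
print("largest/second singular value after: ", after[0] / after[1])
```

A ratio of leading singular values closer to 1 after processing indicates that variance is spread more evenly across directions, which is what "more isotropic" means here.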

CONCLUSIONS

We have demonstrated that the proposed cascading architecture can handle the problem of overlapping annotations and further improve the overall recall and F-scores, because the architecture enables the developed models to exploit more context information and forms an ensemble that yields a stronger recognizer.
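For readers unfamiliar with the overall scores mentioned here, a micro-averaged F-score pools true positives, false positives, and false negatives across all concept types before computing precision and recall once. The sketch below illustrates that arithmetic; the function name `micro_prf` and the counts are made up for illustration and are not results from the paper.

```python
from typing import Dict, Tuple

Counts = Tuple[int, int, int]  # (true positives, false positives, false negatives)


def micro_prf(per_type: Dict[str, Counts]) -> Tuple[float, float, float]:
    """Pool counts over all concept types, then compute P, R and F1 once."""
    tp = sum(c[0] for c in per_type.values())
    fp = sum(c[1] for c in per_type.values())
    fn = sum(c[2] for c in per_type.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Made-up counts for illustration only; they are not results from the paper.
toy_counts = {"Drug": (90, 5, 10), "ADE": (30, 10, 20), "Dosage": (60, 5, 10)}
p, r, f = micro_prf(toy_counts)
print(f"P={p:.3f} R={r:.3f} F1={f:.3f}")  # P=0.900 R=0.818 F1=0.857
```

Because counts are pooled, frequent concept types dominate a micro-averaged score, so a gain in recall on a large type moves the overall figure more than the same gain on a rare type.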
