Augmenting aer2vec: Enriching distributed representations of adverse event report data with orthographic and lexical information.

Affiliations

Department of Biomedical Informatics & Medical Education, University of Washington, Seattle, WA, USA.

Department of Computer Science, Rice University, Houston, TX, USA.

Publication information

J Biomed Inform. 2021 Jul;119:103833. doi: 10.1016/j.jbi.2021.103833. Epub 2021 Jun 8.

DOI:10.1016/j.jbi.2021.103833

PMID:34111555

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8260467/

Abstract

Adverse Drug Events (ADEs) are prevalent, costly, and sometimes preventable. Post-marketing drug surveillance aims to monitor ADEs that occur after a drug is released to market. Reports of such ADEs are aggregated by reporting systems, such as the Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS). In this paper, we consider how best to represent data derived from FAERS reports for the purpose of detecting post-marketing surveillance signals, in order to inform regulatory decision making. In our previous work, we developed aer2vec, a method for deriving distributed representations (concept embeddings) of drugs and side effects from ADE reports, establishing the utility of distributional information for pharmacovigilance signal detection. In this paper, we advance this line of research by evaluating the utility of encoding orthographic and lexical information. We do so by adapting two Natural Language Processing methods, subword embedding and vector retrofitting, which were developed to encode such information into word embeddings. Models were compared for their ability to distinguish between positive and negative examples in a set of manually curated drug/ADE relationships. Both aer2vec enhancements offered advantages in performance over baseline models, and the best performance was obtained when retrofitting and subword embeddings were applied in concert. In addition, this work demonstrates that models leveraging distributed representations do not require extensive manual preprocessing to perform well on this pharmacovigilance signal detection task, and may even benefit from information that would otherwise be lost during the normalization and standardization process.
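
The abstract does not describe how the two adapted methods are implemented, so the following is only a minimal sketch of the general techniques as they are commonly described for word embeddings, not the authors' aer2vec code. It assumes fastText-style character n-grams for the orthographic (subword) side, and a retrofitting update in the style of Faruqui et al. with a weight of 1 on the original vector and 1/degree on each lexicon neighbor. The function names, the toy vectors, and the neighbor lists are all hypothetical.

```python
from typing import Dict, List


def char_ngrams(term: str, n_min: int = 3, n_max: int = 6) -> List[str]:
    """Return character n-grams of a term wrapped in '<' and '>' boundary marks.

    Terms that share spelling (e.g. a drug name with and without a salt
    suffix) share n-grams, which is how subword models capture orthography.
    """
    wrapped = f"<{term.lower()}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams


def retrofit(
    vectors: Dict[str, List[float]],
    neighbors: Dict[str, List[str]],
    iterations: int = 10,
) -> Dict[str, List[float]]:
    """Pull each vector toward the vectors of its lexicon neighbors.

    Assumption: alpha = 1 for the original vector and beta = 1/degree for
    each neighbor, so the update is the mean of the original vector and the
    average of the current neighbor vectors.
    """
    fitted = {term: list(vec) for term, vec in vectors.items()}
    for _ in range(iterations):
        for term, original in vectors.items():
            linked = [n for n in neighbors.get(term, []) if n in fitted]
            if not linked:
                continue  # no lexicon entry: keep the distributional vector
            beta = 1.0 / len(linked)
            for d in range(len(original)):
                neighbor_sum = sum(beta * fitted[n][d] for n in linked)
                fitted[term][d] = (original[d] + neighbor_sum) / 2.0
    return fitted


if __name__ == "__main__":
    # Hypothetical toy data: two related side-effect terms and one unlinked drug.
    toy_vectors = {
        "nausea": [0.0, 0.0],
        "vomiting": [2.0, 2.0],
        "aspirin": [5.0, -1.0],
    }
    toy_lexicon = {"nausea": ["vomiting"], "vomiting": ["nausea"]}
    print(char_ngrams("nausea", 3, 3))
    print(retrofit(toy_vectors, toy_lexicon))
```

In this sketch the two linked terms move toward each other while the unlinked term keeps its original vector, which is the intended effect of retrofitting: lexical knowledge adjusts the embeddings only where the lexicon has something to say.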
