Deep learning for pharmacovigilance: recurrent neural network architectures for labeling adverse drug reactions in Twitter posts.

Authors

Cocos Anne, Fiks Alexander G, Masino Aaron J

Affiliation

Department of Biomedical and Health Informatics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA.

Publication

J Am Med Inform Assoc. 2017 Jul 1;24(4):813-821. doi: 10.1093/jamia/ocw180.

DOI: 10.1093/jamia/ocw180

PMID: 28339747

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7651964/

Abstract

OBJECTIVE

Social media is an important pharmacovigilance data source for adverse drug reaction (ADR) identification. Human review of social media data is infeasible due to data quantity, thus natural language processing techniques are necessary. Social media includes informal vocabulary and irregular grammar, which challenge natural language processing methods. Our objective is to develop a scalable, deep-learning approach that exceeds state-of-the-art ADR detection performance in social media.

MATERIALS AND METHODS

We developed a recurrent neural network (RNN) model that labels words in an input sequence with ADR membership tags. The only input features are word-embedding vectors, which can be formed through task-independent pretraining or during ADR detection training.
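The labeling target described above can be illustrated with a minimal sketch. The abstract does not state the tag scheme, so the two-tag scheme here ("ADR" for words inside an annotated reaction, "O" for everything else), the helper names `spans_to_tags` and `tags_to_spans`, and the example sentence are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def spans_to_tags(tokens: List[str], adr_spans: List[Span]) -> List[str]:
    """Give each token an ADR membership tag (assumed scheme: "ADR" or "O")."""
    tags = ["O"] * len(tokens)
    for start, end in adr_spans:
        # Clamp the span to the sentence so bad offsets cannot raise.
        for i in range(max(start, 0), min(end, len(tokens))):
            tags[i] = "ADR"
    return tags


def tags_to_spans(tags: List[str]) -> List[Span]:
    """Collapse runs of consecutive "ADR" tags back into spans."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "ADR" and start is None:
            start = i
        elif tag != "ADR" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


# Hypothetical post; the annotated reaction is "terrible headache".
tokens = "this med gave me a terrible headache all day".split()
tags = spans_to_tags(tokens, [(5, 7)])
```

A sequence labeler such as the RNN described above would be trained to predict one such tag per word, with each word represented only by its embedding vector.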

RESULTS

Our best-performing RNN model used pretrained word embeddings created from a large, non-domain-specific Twitter dataset. It achieved an approximate match F-measure of 0.755 for ADR identification on the dataset, compared to 0.631 for a baseline lexicon system and 0.65 for the state-of-the-art conditional random field model. Feature analysis indicated that semantic information in pretrained word embeddings boosted sensitivity and, combined with contextual awareness captured in the RNN, precision.
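As a rough illustration of the approximate match F-measure reported above, the sketch below assumes that a predicted span counts as correct when it overlaps any annotated span, and that an annotated span counts as found when any prediction overlaps it. That overlap criterion, the function names, and the toy spans are assumptions; the abstract itself does not define the matching rule.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def overlaps(a: Span, b: Span) -> bool:
    """True when the two half-open spans share at least one token."""
    return a[0] < b[1] and b[0] < a[1]


def approximate_match_f(
    gold: List[Span], predicted: List[Span]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F) under the assumed overlap criterion."""
    matched_pred = sum(any(overlaps(p, g) for g in gold) for p in predicted)
    matched_gold = sum(any(overlaps(g, p) for p in predicted) for g in gold)
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure is the harmonic mean of precision and recall.
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f


# Toy data: two annotated reactions, three predictions (one spurious).
gold_spans = [(5, 7), (10, 12)]
predicted_spans = [(6, 7), (8, 9), (11, 13)]
precision, recall, f_measure = approximate_match_f(gold_spans, predicted_spans)
```

On this toy data two of three predictions overlap an annotation and both annotations are found, so precision is 2/3, recall is 1.0, and the F-measure is 0.8.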

DISCUSSION

Our model required no task-specific feature engineering, suggesting generalizability to additional sequence-labeling tasks. Learning curve analysis showed that our model reached optimal performance with fewer training examples than the other models.

CONCLUSION

ADR detection performance in social media is significantly improved by using a contextually aware model and word embeddings formed from large, unlabeled datasets. The approach reduces manual data-labeling requirements and is scalable to large social media datasets.

Similar articles

Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features.

J Am Med Inform Assoc. 2015 May;22(3):671-81. doi: 10.1093/jamia/ocu041. Epub 2015 Mar 9.

Classifying adverse drug reactions from imbalanced twitter data.

Int J Med Inform. 2019 Sep;129:122-132. doi: 10.1016/j.ijmedinf.2019.05.017. Epub 2019 May 30.

Adversarial neural network with sentiment-aware attention for detecting adverse drug reactions.

J Biomed Inform. 2021 Nov;123:103896. doi: 10.1016/j.jbi.2021.103896. Epub 2021 Sep 4.

Comment on: "Deep learning for pharmacovigilance: recurrent neural network architectures for labeling adverse drug reactions in Twitter posts".

J Am Med Inform Assoc. 2019 Jun 1;26(6):577-579. doi: 10.1093/jamia/ocz013.

Pharmacovigilance with Transformers: A Framework to Detect Adverse Drug Reactions Using BERT Fine-Tuned with FARM.

Comput Math Methods Med. 2021 Aug 13;2021:5589829. doi: 10.1155/2021/5589829. eCollection 2021.

Detecting Adverse Drug Reactions on Twitter with Convolutional Neural Networks and Word Embedding Features.

J Healthc Inform Res. 2018 Apr 12;2(1-2):25-43. doi: 10.1007/s41666-018-0018-9. eCollection 2018 Jun.

Semi-Supervised Recurrent Neural Network for Adverse Drug Reaction mention extraction.

BMC Bioinformatics. 2018 Jun 13;19(Suppl 8):212. doi: 10.1186/s12859-018-2192-4.

Reply to comment on: "Deep learning for pharmacovigilance: recurrent neural network architectures for labeling adverse drug reactions in Twitter posts".

J Am Med Inform Assoc. 2019 Jun 1;26(6):580-581. doi: 10.1093/jamia/ocy192.

Portable automatic text classification for adverse drug reaction detection via multi-corpus training.

J Biomed Inform. 2015 Feb;53:196-207. doi: 10.1016/j.jbi.2014.11.002. Epub 2014 Nov 8.

Cited by

A Fusion Deep Learning Model for Predicting Adverse Drug Reactions Based on Multiple Drug Characteristics.

Life (Basel). 2025 Mar 10;15(3):436. doi: 10.3390/life15030436.

Effectiveness of Transformer-Based Large Language Models in Identifying Adverse Drug Reaction Relations from Unstructured Discharge Summaries in Singapore.

Drug Saf. 2025 Jun;48(6):667-677. doi: 10.1007/s40264-025-01525-w. Epub 2025 Feb 21.

Exploiting question-answer framework with multi-GRU to detect adverse drug reaction on social media.

Sci Rep. 2025 Feb 4;15(1):4157. doi: 10.1038/s41598-025-87724-y.

Bidirectional Long Short-Term Memory-Based Detection of Adverse Drug Reaction Posts Using Korean Social Networking Services Data: Deep Learning Approaches.

JMIR Med Inform. 2024 Nov 20;12:e45289. doi: 10.2196/45289.

Intelligent health in the IS area: A literature review and research agenda.

Fundam Res. 2023 May 11;4(4):961-971. doi: 10.1016/j.fmre.2023.04.008. eCollection 2024 Jul.

Predicting Drugs Suspected of Causing Adverse Drug Reactions Using Graph Features and Attention Mechanisms.

Pharmaceuticals (Basel). 2024 Jun 22;17(7):822. doi: 10.3390/ph17070822.

Transformers and large language models in healthcare: A review.

Artif Intell Med. 2024 Aug;154:102900. doi: 10.1016/j.artmed.2024.102900. Epub 2024 Jun 5.

BiMPADR: A Deep Learning Framework for Predicting Adverse Drug Reactions in New Drugs.

Molecules. 2024 Apr 14;29(8):1784. doi: 10.3390/molecules29081784.

Real-World Data and Evidence in Lung Cancer: A Review of Recent Developments.

Cancers (Basel). 2024 Apr 4;16(7):1414. doi: 10.3390/cancers16071414.

Trends in using deep learning algorithms in biomedical prediction systems.

Front Neurosci. 2023 Nov 9;17:1256351. doi: 10.3389/fnins.2023.1256351. eCollection 2023.
