Department of Pharmacy Practice and Administration, Western University of Health Sciences, 309 E 2nd St, Pomona, CA, 91766, USA.
Department of Pharmacy Practice, Chapman University, 9401 Geronimo Rd, Irvine, CA, 92618, USA.
J Med Toxicol. 2022 Oct;18(4):311-320. doi: 10.1007/s13181-022-00906-2. Epub 2022 Sep 12.
Pharmacovigilance (PV) has proven effective at detecting post-marketing adverse drug events (ADEs). Previous research used natural language processing (NLP) tools to extract ADE-relevant content from unstructured text. However, text stripped of its context reduces the efficiency of such algorithms. Our objective was to develop and validate an innovative NLP tool, aTarantula, which uses a context-aware machine-learning algorithm and an aggregated lexicon to detect existing ADEs from social media.
aTarantula used FastText embeddings and an aggregated lexicon to extract warfarin-related contextual data from three patient forums (MedHelp, MedsChat, and PatientInfo). The lexicon was built from warfarin package inserts and synonyms of warfarin ADEs from the UMLS and FAERS databases. Data were stored in SQLite, then refined and manually checked by three clinical pharmacists for validation.
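The lexicon-matching and storage steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it substitutes plain whole-word matching for the FastText-based context model, and the lexicon entries, category labels, table name (`ade_mentions`), and function names are all hypothetical.

```python
import re
import sqlite3

# Hypothetical mini-lexicon (illustrative only): surface term -> category label.
LEXICON = {
    "nosebleed": "bleeding",
    "bruising": "bleeding",
    "headache": "CNS",
    "dizziness": "CNS",
}

def find_ade_mentions(post_text, lexicon=LEXICON, window=30):
    """Return (term, category, context) for every whole-word lexicon hit."""
    hits = []
    for term, category in lexicon.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, post_text, re.IGNORECASE):
            # Keep a few characters on each side so a reviewer sees the context.
            start = max(0, m.start() - window)
            end = min(len(post_text), m.end() + window)
            hits.append((term, category, post_text[start:end]))
    return hits

def store_mentions(conn, forum, post_text):
    """Insert the hits for one post into SQLite and return how many were stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ade_mentions "
        "(forum TEXT, term TEXT, category TEXT, context TEXT)"
    )
    rows = [(forum, t, c, ctx) for t, c, ctx in find_ade_mentions(post_text)]
    conn.executemany("INSERT INTO ade_mentions VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

Storing the surrounding context with each hit is one way to support the manual review step, since a reviewer can judge from the snippet whether the term actually describes an adverse event.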
ADEs involving multiple organ systems were the most frequently reported, at 1.50%, followed by CNS side effects at 1.19%. Lymphatic system ADEs were the least commonly reported, at 0.09%. The overall Spearman rank correlation coefficient between patient-reported data from the forums and FAERS was 0.19. As determined by manual validation by three clinical pharmacists, aTarantula had a sensitivity of 84.2% and a specificity of 98%. Finally, we created an aggregated lexicon for mining ADEs from social media.
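For readers unfamiliar with the metrics reported above, the following sketch shows how sensitivity, specificity, and a Spearman rank correlation are conventionally computed. The function names are assumptions for illustration, and no study data are reproduced here; any inputs a reader supplies are their own.

```python
from statistics import mean

def sensitivity_specificity(tp, fp, tn, fn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

In this setting, `x` and `y` would be per-category ADE frequencies from the forums and from FAERS, and the confusion-matrix counts would come from comparing the tool's output against the pharmacists' manual labels.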
We successfully developed aTarantula, a machine-learning algorithm based on artificial intelligence that automatically extracts warfarin-related ADEs from online discussion forums. Our study shows that it is feasible to use aTarantula to detect ADEs. Future researchers can validate aTarantula on more diverse datasets.