Department of Mathematics, Computer Science and Physics, University of Udine, Udine 33100, Italy.
Università degli Studi di Napoli Federico II, Napoli 80138, Italy.
Exp Biol Med (Maywood). 2022 Nov;247(22):2003-2014. doi: 10.1177/15353702221128577. Epub 2022 Oct 31.
In the last decade, an increasing number of users have started reporting adverse drug events (ADEs) on social media platforms, blogs, and health forums. Given the large volume of reports, pharmacovigilance has focused on using natural language processing (NLP) techniques to rapidly examine these large collections of text and detect mentions of drug-related adverse reactions that can trigger medical investigations. However, despite the growing interest in the task and the advances in NLP, the robustness of these models in the face of linguistic phenomena such as negation and speculation remains an open research question. Negation and speculation are pervasive in natural language and can severely hamper the ability of an automated system to discriminate between factual and non-factual statements in text. In this article, we consider four state-of-the-art systems for ADE detection on social media texts. We introduce SNAX, a benchmark to test their performance on samples containing negated and speculated ADEs, and show their fragility against these phenomena. We then introduce two possible strategies to increase the robustness of these models, showing that both bring significant increases in performance, lowering the number of spurious entities predicted by the models by 60% for negation and 80% for speculation.