Division of Gastroenterology and Hepatology, Department of Medicine, Mayo Clinic, Phoenix, Arizona, USA.
Department of Medicine, University of California, San Diego, La Jolla, California, USA.
Clin Pharmacol Ther. 2024 Jun;115(6):1391-1399. doi: 10.1002/cpt.3226. Epub 2024 Mar 8.
Outpatient clinical notes are a rich source of information regarding drug safety. However, data in these notes are currently underutilized for pharmacovigilance due to methodological limitations in text mining. Large language models (LLMs) like Bidirectional Encoder Representations from Transformers (BERT) have shown progress in a range of natural language processing tasks but have not yet been evaluated on adverse event (AE) detection. We adapted a new clinical LLM, University of California - San Francisco (UCSF)-BERT, to identify serious AEs (SAEs) occurring after treatment with a non-steroid immunosuppressant for inflammatory bowel disease (IBD). We compared this model to other language models that have previously been applied to AE detection. We annotated 928 outpatient IBD notes, corresponding to 928 individual patients with IBD, for all SAE-associated hospitalizations occurring after treatment with a non-steroid immunosuppressant. These notes contained 703 SAEs in total, the most common of which was failure of intended efficacy. Out of eight candidate models, UCSF-BERT achieved the highest numerical performance on identifying drug-SAE pairs from this corpus (accuracy 88-92%, macro F1 61-68%), with 5-10% greater accuracy than previously published models. UCSF-BERT was significantly superior at identifying hospitalization events emergent to medication use (P < 0.01). LLMs like UCSF-BERT achieve numerically superior accuracy on the challenging task of SAE detection from clinical notes compared with prior methods. Future work is needed to adapt this methodology to improve model performance and evaluation using multicenter data and newer architectures like the generative pre-trained transformer (GPT). Our findings support the potential value of using large language models to enhance pharmacovigilance.