Paredes Daniel, Talankar Sankalp, Peng Cheng, Balian Patrick, Lewis Motomoti, Yan Shunhun, Tsai Wen-Shan, Chang Ching-Yuan, Wilson Debbie L, Lo-Ciganic Wei-Hsuan, Wu Yonghui
Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA.
AMIA Jt Summits Transl Sci Proc. 2025 Jun 10;2025:414-421. eCollection 2025.
Opioid overdose and opioid use disorder (OUD) remain a growing public health issue in the United States, affecting 6.1 million individuals in 2022, more than double the 2.5 million affected in 2021. Accurately identifying information related to opioid overdose and OUD is critical for studying outcomes and developing interventions. This study aims to identify mentions of opioid overdose and OUD, and their related information, in clinical narratives. We compared encoder-based large language models (LLMs) and decoder-based generative LLMs in extracting nine crucial concepts related to opioid overdose and OUD, including problematic opioid use. Using a cost-effective p-tuning algorithm, our decoder-based generative LLM, GatorTronGPT, achieved the best strict and lenient F1-scores of 0.8637 and 0.9057, respectively, demonstrating the efficiency of using generative LLMs to extract information related to opioid overdose and OUD. This study provides a tool to systematically extract opioid overdose, OUD, and their related information to facilitate opioid-related studies using clinical narratives.
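The strict and lenient F1-scores reported above can be illustrated with a minimal sketch. It assumes the common convention in clinical concept extraction that a strict match requires identical span boundaries and concept type, while a lenient match only requires overlapping spans of the same type; the abstract does not specify the exact evaluation script. The function names, the span representation, and the labels `Opioid`, `OUD`, and `Overdose` are hypothetical and are not taken from the study's annotation schema.

```python
from typing import List, Tuple

# A span is (start_offset, end_offset_exclusive, concept_label).
# This representation and the labels used below are illustrative assumptions.
Span = Tuple[int, int, str]


def _overlaps(a: Span, b: Span) -> bool:
    """True if two spans share a label and at least one character."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def _f1(pred_hits: int, gold_hits: int, n_pred: int, n_gold: int) -> float:
    """F1 from matched counts; returns 0.0 when undefined."""
    if n_pred == 0 or n_gold == 0:
        return 0.0
    precision = pred_hits / n_pred
    recall = gold_hits / n_gold
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def strict_lenient_f1(gold: List[Span], pred: List[Span]) -> Tuple[float, float]:
    """Return (strict_f1, lenient_f1) for one document's spans."""
    gold_set, pred_set = set(gold), set(pred)

    # Strict: boundaries and label must match exactly.
    exact = len(gold_set & pred_set)
    strict = _f1(exact, exact, len(pred_set), len(gold_set))

    # Lenient: any character overlap with a same-label span counts.
    pred_hits = sum(any(_overlaps(p, g) for g in gold_set) for p in pred_set)
    gold_hits = sum(any(_overlaps(g, p) for p in pred_set) for g in gold_set)
    lenient = _f1(pred_hits, gold_hits, len(pred_set), len(gold_set))

    return strict, lenient


# Toy example: one exact match, one boundary error, one miss.
gold_spans = [(0, 5, "Opioid"), (10, 18, "OUD"), (25, 30, "Overdose")]
pred_spans = [(0, 5, "Opioid"), (11, 18, "OUD"), (40, 45, "Overdose")]
strict_f1, lenient_f1 = strict_lenient_f1(gold_spans, pred_spans)
```

In this toy example the prediction with a shifted start offset counts as correct only under lenient matching, which is why a lenient score is never lower than the strict score on the same predictions.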