Cha Hwa Jun, Choe Kyuyeon, Shin Euibeom, Ramanathan Murali, Han Sungpil
Department of Pharmacology, College of Medicine, The Catholic University of Korea, 222 Banpo-daero, Seocho-Gu, Seoul, 06591, Korea.
PIPET (Pharmacometrics Institute for Practical Education and Training), College of Medicine, The Catholic University of Korea, Seoul, 06591, Republic of Korea.
J Pharmacokinet Pharmacodyn. 2025 Jun 4;52(3):34. doi: 10.1007/s10928-025-09982-7.
Advancements in large language models (LLMs) have suggested their potential utility for diverse pharmacometrics tasks. This study investigated the performance of LLMs in generating model structure diagrams, publication-ready tables, and analysis reports, and in conducting simulations, using output files from pharmacometrics models. Forty-four NONMEM output files were obtained from the GitHub software repository. The performance of Claude 3.5 Sonnet (Claude) and ChatGPT 4o was compared with that of two other candidate LLMs, Gemini 1.5 Pro and Llama 3.2. Prompt engineering was conducted with Claude for pharmacometrics tasks such as generating model structure diagrams, parameter tables, and analysis reports. Simulations were conducted using ChatGPT. Claude Artifacts was used to visualize the model structure diagrams, parameter tables, and analysis reports. A web-based R Shiny application was implemented to provide an accessible interface for automating the generation of pharmacometric model structure diagrams, parameter tables, and analysis reports. Claude was selected for further investigation following performance comparisons with ChatGPT 4o, Gemini 1.5 Pro, and Llama 3.2 on the model structure diagram and parameter table generation tasks. Claude successfully generated model structure diagrams for 40 (90.9%) of the 44 NONMEM output files with the initial prompts, and the remaining 4 were resolved with an additional prompt. Claude consistently generated accurate parameter summary tables and succinct model analysis reports. Modest variability was identified among the model structure diagrams generated from replicate prompts. ChatGPT demonstrated simulation capabilities but showed limitations with complex pharmacokinetic/pharmacodynamic (PK/PD) models. LLMs have the potential to enhance key pharmacometrics modeling tasks. However, expert review of the generated results is essential.