Sridharan Kannan, Sivaramakrishnan Gowri
Pharmacology & Therapeutics, College of Medicine & Health Sciences, Arabian Gulf University, Manama, Bahrain
Bahrain Defence Force Royal Medical Services, Riffa, Bahrain.
Eur J Hosp Pharm. 2024 Dec 30. doi: 10.1136/ejhpharm-2024-004245.
Large language models (LLMs) with advanced language generation capabilities have the potential to enhance patient interactions. This study evaluates the effectiveness of ChatGPT 4.0 and Gemini 1.0 Pro in providing patient instructions and creating patient educational material (PEM).
A cross-sectional study employed ChatGPT 4.0 and Gemini 1.0 Pro across six medical scenarios using simple and detailed prompts. The Patient Education Materials Assessment Tool for Print materials (PEMAT-P) evaluated the understandability, actionability, and readability of the outputs.
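PEMAT-P scores each domain as the percentage of applicable items rated "Agree"; items judged not applicable are excluded from the denominator. A minimal sketch of that scoring rule (the item ratings shown are illustrative, not the study's data):

```python
def pemat_domain_score(ratings):
    """Compute a PEMAT-P domain score.

    Each item is rated Agree (1) or Disagree (0); None marks an item
    judged not applicable, which is dropped before scoring. The score
    is the percentage of applicable items rated Agree.
    """
    applicable = [r for r in ratings if r is not None]
    return round(100 * sum(applicable) / len(applicable))

# Hypothetical example: 8 of 10 applicable understandability
# items rated Agree, one item not applicable.
print(pemat_domain_score([1, 1, 1, 1, 1, 1, 1, 1, 0, 0, None]))  # 80
```

A score of 70% or higher is commonly treated as the threshold for adequately understandable or actionable material.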
LLMs provided consistent responses, especially regarding drug information, therapeutic goals, administration, common side effects, and interactions. However, they lacked guidance on expiration dates and proper medication disposal. Detailed prompts yielded comprehensible outputs for the average adult. ChatGPT 4.0 had mean understandability and actionability scores of 80% and 60%, respectively, compared with 67% and 60% for Gemini 1.0 Pro. ChatGPT 4.0 produced longer outputs, achieving 85% readability with detailed prompts, while Gemini 1.0 Pro maintained consistent readability. Simple prompts resulted in ChatGPT 4.0 outputs at a 10th-grade reading level, while Gemini 1.0 Pro outputs were at a 7th-grade level. Both LLMs produced outputs at a 6th-grade level with detailed prompts.
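The abstract reports reading levels by US school grade but does not state which readability formula was used; the Flesch-Kincaid Grade Level is a common choice for patient materials and can be sketched as follows (the syllable counter is a crude heuristic, not a dictionary-backed one):

```python
import re

def count_syllables(word):
    # Crude vowel-group heuristic; production tools use
    # pronunciation dictionaries for accuracy.
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1  # drop a typical silent final 'e'
    return max(n, 1)

def fk_grade(text):
    """Flesch-Kincaid Grade Level:
    0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    """
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / sentences + 11.8 * syllables / len(words) - 15.59

# Illustrative sentence, not from the study's outputs.
print(round(fk_grade("Take one tablet by mouth every day."), 1))  # 4.0
```

Under this formula, a 6th-grade score means the text should be readable by an average 11- to 12-year-old, which is why detailed prompts that pushed both models to a 6th-grade level matter for patient-facing material.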
LLMs show promise in generating patient instructions and PEM. However, healthcare professional oversight and patient education on LLM use are essential for effective implementation.