Reis Zilma Silveira Nogueira, Pagano Adriana Silvina, Ramos de Oliveira Isaias Jose, Dias Cristiane Dos Santos, Lage Eura Martins, Mineiro Erico Franco, Varella Pereira Glaucia Miranda, de Carvalho Gomes Igor, Basilio Vinicius Araujo, Cruz-Correia Ricardo João, de Jesus Davi Dos Reis, de Souza Júnior Antônio Pereira, da Rocha Leonardo Chaves Dutra
Health Informatics Center, Faculty of Medicine, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil.
Arts Faculty, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil.
Mayo Clin Proc Digit Health. 2024 Dec;2(4):632-644. doi: 10.1016/j.mcpdig.2024.09.006.
To assess how large language models (LLMs) can support the generation of clearer and more personalized medication directions to enhance e-prescribing.
We established patient-centered guidelines for adequate, acceptable, and personalized directions to enhance e-prescribing. Following the Brazilian national e-prescribing standard, we developed a dataset of 104 outpatient scenarios covering a range of medications, administration routes, and patient conditions. Three prompts were submitted to a closed-source LLM: the first was a generic command, the second was calibrated for content enhancement and personalization, and the third requested bias mitigation. The third prompt was also submitted to an open-source LLM. Outputs were assessed using automated metrics and human evaluation. The study was conducted between March 1, 2024, and September 10, 2024.
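As an illustration of the kind of pipeline described above, the following minimal Python sketch shows how each outpatient scenario could be submitted under the 3 prompt conditions to an OpenAI-compatible chat API. The abstract does not name the models, interfaces, or prompt wording used; the model name, prompt templates, and helper function here are assumptions for illustration only, not the study's actual materials.

```python
# Illustrative sketch only: model name, prompt wording, and helper are assumptions,
# not the study's actual materials. Assumes the OpenAI Python SDK (v1) and an
# OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

PROMPTS = {
    "p1_generic": (
        "Rewrite the following e-prescription directions for the patient:\n{case}"
    ),
    "p2_personalized": (
        "Rewrite the following e-prescription directions so they are adequate, "
        "acceptable, and personalized to the patient's condition:\n{case}"
    ),
    "p3_bias_mitigated": (
        "Rewrite the following e-prescription directions so they are adequate, "
        "acceptable, personalized, and free of gender or other bias:\n{case}"
    ),
}

def generate_directions(case_text: str, prompt_key: str, model: str = "gpt-4o") -> str:
    """Submit one outpatient scenario under one prompt condition and return the generated text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPTS[prompt_key].format(case=case_text)}],
        temperature=0,  # deterministic output eases comparison across prompt conditions
    )
    return response.choices[0].message.content
```

In a setup like this, each of the 104 scenarios would be run through every prompt condition and the outputs stored for automated and human evaluation.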
Adequacy scores of the closed-source LLM's output showed the third prompt outperforming the first and second ones. Full or partial acceptability was achieved in 94.3% of the texts generated with the third prompt. Personalization was rated highly, especially for the second and third prompts. The 2 LLMs showed similar adequacy results. Lack of scientific evidence and factual errors were infrequent and not linked to a particular prompt or LLM. The frequency of hallucinations differed between the 2 LLMs and concerned prescriptions issued upon symptom manifestation and medications requiring dosage adjustment or involving intermittent use. Gender bias was found in the closed-source LLM's output for the first and second prompts, whereas output for the third prompt was bias-free. The open-source LLM's output was bias-free.
This study demonstrates the potential of LLM-supported text generation to produce prescription directions and to improve communication between health professionals and patients within the e-prescribing system.