The Use of Large Language Models in Generating Patient Education Materials: a Scoping Review.

Authors

AlSammarraie Alhasan, Househ Mowafa

Affiliation

Faculty College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar.

Publication

Acta Inform Med. 2025;33(1):4-10. doi: 10.5455/aim.2024.33.4-10.

DOI:10.5455/aim.2024.33.4-10

PMID:40223858

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11986337/

Abstract

BACKGROUND

Patient education is a healthcare concept that involves educating the public with evidence-based medical information. This information strengthens their ability to lead healthier lives and better manage their conditions. Large language model (LLM) platforms have recently been introduced as powerful natural language processing (NLP) tools capable of producing human-sounding text and, by extension, patient education materials.

OBJECTIVE

This study aims to conduct a scoping review to systematically map the existing literature on the use of LLMs for generating patient education materials.

METHODS

The study followed JBI guidelines and searched five databases using set inclusion and exclusion criteria. A framework inspired by retrieval-augmented generation (RAG) was used to extract the variables, followed by a manual check to verify the accuracy of the extractions. In total, 21 variables were identified and grouped into five themes: Study Demographics, LLM Characteristics, Prompt-Related Variables, PEM (patient education material) Assessment, and Comparative Outcomes.

RESULTS

Results were reported from 69 studies. The United States contributed the largest number of studies. ChatGPT-4, ChatGPT-3.5, and Bard were the most frequently investigated LLMs. Most studies evaluated the accuracy and readability of LLM responses. Only 3 studies used external knowledge bases through a RAG architecture. All but 3 studies conducted prompting in English. ChatGPT-4 was found to provide the most accurate responses compared with other models.

CONCLUSION

This review examined studies comparing large language models for generating patient education materials. ChatGPT-3.5 and ChatGPT-4 were the most frequently evaluated models. Accuracy and readability of responses were the main evaluation metrics, while few studies used assessment frameworks, applied retrieval-augmented methods, or explored non-English cases.
