

De novo generation of colorectal patient educational materials using large language models: Prompt engineering key to improved readability.

Author Information

Ellison India E, Oslock Wendelyn M, Abdullah Abiha, Wood Lauren, Thirumalai Mohanraj, English Nathan, Jones Bayley A, Hollis Robert, Rubyan Michael, Chu Daniel I

Affiliations

Department of Surgery, University of Alabama at Birmingham, AL.

Department of Surgery, University of Alabama at Birmingham, AL; Department of Quality, Birmingham Veterans Affairs Medical Center, AL. Electronic address: https://www.twitter.com/WendelynOslock.

Publication Information

Surgery. 2025 Apr;180:109024. doi: 10.1016/j.surg.2024.109024. Epub 2025 Jan 4.

Abstract

BACKGROUND

Improving patient education has been shown to improve clinical outcomes and reduce disparities, though such efforts can be labor intensive. Large language models may serve as an accessible method to improve patient educational material. The aim of this study was to compare readability between existing educational materials and those generated by large language models.

METHODS

Baseline colorectal surgery educational materials were gathered from a large academic institution (n = 52). Three prompts were entered into Perplexity and ChatGPT 3.5 for each topic: a Basic prompt that simply requested patient educational information on the topic, an Iterative prompt that repeatedly asked for the information to be made more health literate, and a Metric-based prompt that requested a sixth-grade reading level, short sentences, and short words. Flesch-Kincaid Grade Level (Grade Level), Flesch-Kincaid Reading Ease (Ease), and Modified Grade Level scores were calculated for all materials, and unpaired t tests were used to compare mean scores between baseline materials and documents generated by the artificial intelligence platforms.
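The two Flesch-Kincaid metrics named above have standard closed-form definitions based on words per sentence and syllables per word. A minimal sketch in Python follows; the syllable counter is a rough vowel-group heuristic (real readability tools use dictionary-based counters), so scores are approximate, and the Modified Grade Level variant used in the study is not reproduced here.

```python
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count groups of consecutive vowels,
    # dropping a common silent final 'e' (but keeping '-le'/'-ee').
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1 and not word.endswith(("le", "ee")):
        n -= 1
    return max(n, 1)

def fk_scores(text: str) -> tuple[float, float]:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)           # words per sentence
    spw = syllables / len(words)                # syllables per word
    grade = 0.39 * wps + 11.8 * spw - 15.59     # Flesch-Kincaid Grade Level
    ease = 206.835 - 1.015 * wps - 84.6 * spw   # Flesch Reading Ease
    return grade, ease
```

Both formulas reward short sentences and short words, which is why the Metric-based prompt explicitly requests exactly those properties.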

RESULTS

Overall, existing materials were longer than materials generated by the large language models across categories and prompts: 863-956 words vs 170-265 words (ChatGPT) and 220-313 words (Perplexity), all P < .01. Baseline materials did not meet sixth-grade readability guidelines based on grade level (Grade Level 7.0-9.8 and Modified Grade Level 9.6-11.5) or ease of readability (Ease 53.1-65.0). Readability of materials generated by a large language model varied by prompt and platform. Overall, ChatGPT materials generated with the Metric-based prompt were more readable than baseline materials: Grade Level 5.2 vs 8.1, Modified Grade Level 7.3 vs 10.3, and Ease 70.5 vs 60.4, all P < .01. In contrast, Perplexity-generated materials were significantly less readable, except for those generated with the Metric-based prompt, which did not statistically differ from baseline.

CONCLUSION

Both existing materials and the majority of educational materials created by large language models failed to meet readability recommendations. The exception was ChatGPT materials generated with the Metric-based prompt, which consistently improved readability scores from baseline and met recommendations for average Grade Level score. This variability in performance highlights the importance of the prompt used with large language models.

