

Assessing the Application of Large Language Models in Generating Dermatologic Patient Education Materials According to Reading Level: Qualitative Study.

Author Affiliations

Pritzker School of Medicine, University of Chicago, Chicago, IL, United States.

Section of Dermatology, University of Chicago Medical Center, Chicago, IL, United States.

Publication Information

JMIR Dermatol. 2024 May 16;7:e55898. doi: 10.2196/55898.

DOI: 10.2196/55898
PMID: 38754096
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11140271/
Abstract

BACKGROUND

Dermatologic patient education materials (PEMs) are often written above the national average seventh- to eighth-grade reading level. ChatGPT-3.5, GPT-4, DermGPT, and DocsGPT are large language models (LLMs) that are responsive to user prompts. Our project assesses their use in generating dermatologic PEMs at specified reading levels.

OBJECTIVE

This study aims to assess the ability of select LLMs to generate PEMs for common and rare dermatologic conditions at unspecified and specified reading levels. Further, the study aims to assess the preservation of meaning across such LLM-generated PEMs, as assessed by dermatology resident trainees.

METHODS

The Flesch-Kincaid reading level (FKRL) of current American Academy of Dermatology PEMs was evaluated for 4 common (atopic dermatitis, acne vulgaris, psoriasis, and herpes zoster) and 4 rare (epidermolysis bullosa, bullous pemphigoid, lamellar ichthyosis, and lichen planus) dermatologic conditions. We prompted ChatGPT-3.5, GPT-4, DermGPT, and DocsGPT to "Create a patient education handout about [condition] at a [FKRL]" to iteratively generate 10 PEMs per condition at unspecified, fifth-grade, and seventh-grade FKRLs, evaluated with Microsoft Word readability statistics. The preservation of meaning across LLMs was assessed by 2 dermatology resident trainees.
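The Flesch-Kincaid grade level that the study targets is a simple formula over sentence, word, and syllable counts: 0.39 × (words/sentences) + 11.8 × (syllables/words) − 15.59. The study itself used Microsoft Word's readability statistics; the sketch below is only an illustration of the metric, and its naive vowel-group syllable counter is an assumption, not the counter Word uses.

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: count runs of vowels; every word gets at least 1 syllable.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_kincaid_grade(text: str) -> float:
    # Flesch-Kincaid Grade Level:
    #   0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)

# A patient-education-style sentence pair scores around a sixth-grade level:
print(round(flesch_kincaid_grade(
    "Eczema makes skin dry and itchy. Use a gentle cream every day."), 2))
```

Because the formula rewards short sentences and short words, prompting a model toward a fifth- or seventh-grade FKRL effectively asks it to shorten both, which is why the generated handouts' scores cluster below the unspecified-prompt baseline.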

RESULTS

The current American Academy of Dermatology PEMs had an average (SD) FKRL of 9.35 (1.26) and 9.50 (2.3) for common and rare diseases, respectively. For common diseases, the FKRLs of LLM-produced PEMs ranged between 9.8 and 11.21 (unspecified prompt), between 4.22 and 7.43 (fifth-grade prompt), and between 5.98 and 7.28 (seventh-grade prompt). For rare diseases, the FKRLs of LLM-produced PEMs ranged between 9.85 and 11.45 (unspecified prompt), between 4.22 and 7.43 (fifth-grade prompt), and between 5.98 and 7.28 (seventh-grade prompt). At the fifth-grade reading level, GPT-4 was better at producing PEMs for both common and rare conditions than ChatGPT-3.5 (P=.001 and P=.01, respectively), DermGPT (P<.001 and P=.03, respectively), and DocsGPT (P<.001 and P=.02, respectively). At the seventh-grade reading level, no significant difference was found between ChatGPT-3.5, GPT-4, DocsGPT, or DermGPT in producing PEMs for common conditions (all P>.05); however, for rare conditions, ChatGPT-3.5 and DocsGPT outperformed GPT-4 (P=.003 and P<.001, respectively). The preservation of meaning analysis revealed that for common conditions, DermGPT ranked the highest for overall ease of reading, patient understandability, and accuracy (14.75/15, 98%); for rare conditions, handouts generated by GPT-4 ranked the highest (14.5/15, 97%).

CONCLUSIONS

GPT-4 appeared to outperform ChatGPT-3.5, DocsGPT, and DermGPT at the fifth-grade FKRL for both common and rare conditions, although both ChatGPT-3.5 and DocsGPT performed better than GPT-4 at the seventh-grade FKRL for rare conditions. LLM-produced PEMs may reliably meet seventh-grade FKRLs for select common and rare dermatologic conditions and are easy to read, understandable for patients, and mostly accurate. LLMs may play a role in enhancing health literacy and disseminating accessible, understandable PEMs in dermatology.
