Comparison of artificial intelligence-generated and physician-generated patient education materials on early diabetic kidney disease.

Author information

Cheng Miaomiao, Zhang Qi, Liang Hua, Wang Yanan, Qin Jun, Gong Lei, Wang Sha, Li Luyao, Xiao Xiaoyan

Affiliations

Qilu Hospital of Shandong University, Department of Nephrology, Jinan, Shandong, China.

Healthcare Big Data Research Institute, Cheeloo College of Medicine, Shandong University, Jinan, Shandong, China.

Publication information

Front Endocrinol (Lausanne). 2025 Apr 22;16:1559265. doi: 10.3389/fendo.2025.1559265. eCollection 2025.

DOI:10.3389/fendo.2025.1559265

PMID:40331140

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12052532/

Abstract

BACKGROUND

Diabetic kidney disease (DKD) is a common and serious complication of diabetes mellitus and has become the most important cause of end-stage renal disease (ESRD). Given the rising prevalence of diabetes, early detection of and intervention in DKD are increasingly urgent. With the rapid development of artificial intelligence (AI) technologies, the potential of AI for patient education, and of large language models (LLMs) in particular, is receiving increasing attention. This study aimed to evaluate the quality of LLM-generated patient education materials (PEMs) for early DKD and to explore their feasibility in patient education.

METHODS

Four LLMs (ERNIE Bot 4.0, GPT-4o, ChatGLM4, and ChatGPT-o1) were selected to generate PEMs. ERNIE Bot 4.0, GPT-4o, and ChatGLM4 each generated two versions of a PEM: one based on American Diabetes Association (ADA) guidelines and one without them. ChatGPT-o1 generated only a PEM without ADA guidelines. An experienced physician wrote a PEM based on ADA guidelines. All materials were assessed on a Likert scale covering accuracy, completeness, safety, and patient comprehensibility. A total of 7 medical experts (including nephrologists and endocrinologists) and 50 diabetic patients were invited to evaluate the materials. Basic information on the patient evaluators was recorded.
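As a rough illustration of how Likert ratings of this kind could be aggregated, the sketch below averages scores per material and per dimension and then ranks the materials by their overall mean. It is not the authors' analysis: the 1-to-5 scale bounds, the function names (`mean_scores`, `rank_materials`), and the equal weighting of dimensions are all assumptions made for the example, and the abstract does not state how scores were combined.

```python
from collections import defaultdict
from statistics import mean

# The four dimensions named in the abstract.
DIMENSIONS = ("accuracy", "completeness", "safety", "comprehensibility")


def mean_scores(ratings, low=1, high=5):
    """Average ratings per (material, dimension).

    ratings: iterable of (material, dimension, score) tuples.
    low/high: assumed Likert bounds; the abstract does not give the scale width.
    """
    buckets = defaultdict(list)
    for material, dimension, score in ratings:
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dimension!r}")
        if not low <= score <= high:
            raise ValueError(f"score {score} outside {low}..{high}")
        buckets[(material, dimension)].append(score)
    return {key: mean(values) for key, values in buckets.items()}


def rank_materials(ratings, low=1, high=5):
    """Rank materials by the unweighted mean of their per-dimension averages.

    Equal weighting across dimensions is an assumption of this sketch.
    """
    per_dimension = mean_scores(ratings, low, high)
    by_material = defaultdict(list)
    for (material, _dimension), average in per_dimension.items():
        by_material[material].append(average)
    overall = {m: mean(avgs) for m, avgs in by_material.items()}
    return sorted(overall, key=overall.get, reverse=True)
```

Calling `rank_materials` on a list of `(material, dimension, score)` tuples returns the material labels ordered from highest to lowest overall mean; any labels and scores passed in would be the caller's own data, not values from the study.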

RESULTS

Experts evaluated the PEMs from ERNIE Bot 4.0, GPT-4o, ChatGLM4, and ChatGPT-o1, plus the physician-sourced PEM. ERNIE Bot 4.0's non-guideline PEM and the physician-sourced PEM ranked as the top two. Patient assessments of these two top-scoring PEMs found that ERNIE Bot 4.0's non-guideline PEM performed as well as, if not slightly better than, the physician-sourced PEM in patient comprehensibility, completeness, and safety. In addition, the non-guideline-based PEM was preferred by patients with a history of diabetes longer than 5 years and by patients with proteinuria. Surprisingly, the non-guideline PEMs from GPT-4o and ChatGLM4 outperformed their guideline-based counterparts.

CONCLUSION

The LLM-sourced PEMs for early DKD, especially ERNIE Bot 4.0's non-guideline PEM, performed comparably to the physician-sourced PEM in accuracy, completeness, safety, and patient comprehensibility, and showed a high degree of feasibility. AI may show potential for broader applications in patient education in the near future.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dedd/12052532/93d7b57da2fd/fendo-16-1559265-g001.jpg
