Jabeen Jaziya, Saji Jyothis G
Cardiology, Royal Cornwall Hospital, Cornwall, GBR.
Emergency Medicine, Jubilee Mission Medical College & Research Institute, Thrissur, IND.
Cureus. 2025 Jun 3;17(6):e85277. doi: 10.7759/cureus.85277. eCollection 2025 Jun.
Introduction: Artificial intelligence (AI) chatbots, including ChatGPT and DeepSeek, are becoming popular tools for generating patient education materials for chronic diseases. AI chatbots are useful supplements to traditional counseling but lack the empathy and intuition of healthcare professionals, making them most effective when used alongside human therapists. The objective of this study was to compare patient education guides generated by ChatGPT-4o and DeepSeek V3 for epilepsy, heart failure, chronic obstructive pulmonary disease (COPD), and chronic kidney disease (CKD).

Methodology: In this cross-sectional study, standardized prompts for each disease were entered into ChatGPT and DeepSeek. The resulting texts were evaluated for readability, originality, quality, and suitability. Unpaired t-tests were used to test for statistical differences between the two tools.

Results: The two tools produced patient education materials with similar word and sentence counts, readability scores, reliability, and suitability across all domains; the only significant difference was the similarity percentage, which was higher in ChatGPT outputs (p=0.049). Readability scores indicated that both tools produced content above the recommended reading level for patient materials, and both yielded similarity indices exceeding accepted academic thresholds. Reliability scores were moderate; understandability was high, but actionability scores were suboptimal for both models.

Conclusion: The patient education materials generated by ChatGPT and DeepSeek are broadly similar, but neither meets recommended standards for readability, originality, or actionability. Both require further fine-tuning and human oversight to improve accessibility, reliability, and practical utility in clinical settings.
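To make the methodology concrete, below is a minimal Python sketch of the kind of analysis the abstract describes: scoring each generated guide for readability and comparing the two tools with an unpaired t-test. It assumes the third-party textstat and scipy packages; the sample texts, the choice of the Flesch-Kincaid grade level as the readability metric, and all variable names are illustrative assumptions, not the study's actual pipeline or data.

    # Illustrative sketch, not the authors' code: readability scoring plus
    # an unpaired (independent-samples) t-test between two AI tools.
    import textstat                  # third-party readability library
    from scipy.stats import ttest_ind

    # Placeholder guides (the study used four diseases: epilepsy, heart
    # failure, COPD, CKD); these strings are dummies, not study outputs.
    chatgpt_guides = [
        "Epilepsy is a brain condition that causes repeated seizures.",
        "Heart failure means the heart cannot pump blood well enough.",
    ]
    deepseek_guides = [
        "Epilepsy causes sudden bursts of electrical activity in the brain.",
        "In heart failure, the heart muscle gradually weakens over time.",
    ]

    # Flesch-Kincaid grade level: higher means harder to read. Patient
    # education materials are usually targeted at about a 6th-grade level.
    chatgpt_fkgl = [textstat.flesch_kincaid_grade(t) for t in chatgpt_guides]
    deepseek_fkgl = [textstat.flesch_kincaid_grade(t) for t in deepseek_guides]

    # Unpaired t-test between the two tools' readability scores.
    t_stat, p_value = ttest_ind(chatgpt_fkgl, deepseek_fkgl)
    print(f"FKGL ChatGPT={chatgpt_fkgl} DeepSeek={deepseek_fkgl}")
    print(f"t={t_stat:.3f}, p={p_value:.3f}")

An unpaired t-test fits here because each tool's outputs form an independent group of scores; the same pattern would apply to the similarity, reliability, and suitability measures the study reports.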