Siu Adrian H Y, Gibson Damien P, Chiu Chris, Kwok Allan, Irwin Matt, Christie Adam, Koh Cherry E, Keshava Anil, Reece Mifanwy, Suen Michael, Rickard Matthew J F X
Concord Institute of Academic Surgery (CIAS), Concord Repatriation General Hospital, Concord, New South Wales, Australia.
Faculty of Medicine and Health, Central Clinical School, The University of Sydney, Sydney, New South Wales, Australia.
Colorectal Dis. 2025 Jan;27(1):e17267. doi: 10.1111/codi.17267. Epub 2024 Dec 17.
Artificial intelligence (AI) chatbots such as Chat Generative Pretrained Transformer-4 (ChatGPT-4) have made significant strides in generating human-like responses. Trained on an extensive corpus that includes medical literature, ChatGPT-4 has the potential to augment patient education materials, and such chatbots may benefit people facing a diagnosis of colorectal cancer (CRC). However, the accuracy and quality of patient education materials are crucial for informed decision-making. Given workforce pressures that limit the time available for holistic care, AI chatbots could help bridge gaps in CRC information, reaching wider demographics and crossing language barriers, but rigorous evaluation is essential to ensure accuracy, quality and readability. This study therefore aims to evaluate the efficacy, quality and readability of answers generated by ChatGPT-4 on CRC, using patient-style question prompts.
To evaluate ChatGPT-4, eight CRC-related questions were derived from peer-reviewed literature and Google Trends. Eight colorectal surgeons rated the AI responses for accuracy, safety, appropriateness, actionability and effectiveness. Quality was assessed using validated tools: the Patient Education Materials Assessment Tool (PEMAT-AI), a modified DISCERN instrument (DISCERN-AI) and the Global Quality Score (GQS). Readability was measured with several indices, including the Flesch Reading Ease (FRE) score and the Gunning Fog Index (GFI).
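For context, the two readability indices reported below follow their standard published formulas (the abstract does not state which implementation the authors used); a reference sketch, where "complex words" are those of three or more syllables:

\[ \mathrm{FRE} = 206.835 - 1.015\,\frac{\text{total words}}{\text{total sentences}} - 84.6\,\frac{\text{total syllables}}{\text{total words}} \]
\[ \mathrm{GFI} = 0.4\left(\frac{\text{total words}}{\text{total sentences}} + 100\,\frac{\text{complex words}}{\text{total words}}\right) \]

Higher FRE indicates easier text (scores below about 60 are generally considered difficult), while GFI approximates the years of formal education needed to understand the text on a first reading.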
The responses were generally accurate (median 4.00), safe (4.25), appropriate (4.00), actionable (4.00) and effective (4.00). Quality assessments rated PEMAT-AI as 'very good' (71.43), DISCERN-AI as 'fair' (12.00) and GQS as 'high' (4.00). Readability scores indicated that the responses were difficult to read (FRE 47.00, GFI 12.40), implying that a relatively high level of education is required to comprehend them.
This study concludes that ChatGPT-4 can provide safe but nonspecific medical information, suggesting its potential as a patient education aid. However, improvements in readability, through contextual prompting and fine-tuning techniques, are required before implementation in clinical practice can be considered.