Comparing answers of ChatGPT and Google Gemini to common questions on benign anal conditions.

Author Information

Maron C M, Emile S H, Horesh N, Freund M R, Pellino G, Wexner S D

Affiliations

Trinity College, Hartford, CT, USA.

Ellen Leifer Shulman and Steven Shulman Digestive Disease Center, Cleveland Clinic Florida, 2950 Cleveland Clinic Blvd, Weston, FL, USA.

Publication Information

Tech Coloproctol. 2025 Jan 26;29(1):57. doi: 10.1007/s10151-024-03096-x.

Abstract

INTRODUCTION

Chatbots have been increasingly used as a source of patient education. This study aimed to compare the answers of ChatGPT-4 and Google Gemini to common questions on benign anal conditions in terms of appropriateness, comprehensiveness, and language level.

METHODS

Each chatbot was asked a set of 30 questions on hemorrhoidal disease, anal fissures, and anal fistulas. The responses were assessed for appropriateness, comprehensiveness, and provision of references by three subject experts who were blinded to the identity of the chatbots. The language level of the chatbot answers was assessed using the Flesch-Kincaid Reading Ease score and grade level.
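
For context, the two readability metrics are computed from the standard Flesch-Kincaid formulas (not stated in the abstract, but well-established definitions):

\text{Reading Ease} = 206.835 - 1.015\left(\frac{\text{total words}}{\text{total sentences}}\right) - 84.6\left(\frac{\text{total syllables}}{\text{total words}}\right)

\text{Grade Level} = 0.39\left(\frac{\text{total words}}{\text{total sentences}}\right) + 11.8\left(\frac{\text{total syllables}}{\text{total words}}\right) - 15.59

Higher Reading Ease scores indicate easier text, while the Grade Level corresponds to a U.S. school grade, so the 6th-grade target commonly recommended for patient education materials maps to a Grade Level of about 6.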

RESULTS

Overall, the answers provided by both models were appropriate and comprehensive. However, the answers of Google Gemini were more appropriate, more comprehensive, and more often supported by references than those of ChatGPT. In addition, agreement among the assessors on the appropriateness of the Google Gemini answers was higher, indicating greater consistency. ChatGPT had a significantly higher Flesch-Kincaid grade level than Google Gemini (12.3 versus 10.6, p = 0.015) but a similar median Flesch-Kincaid Reading Ease score.

CONCLUSIONS

The answers of Google Gemini to questions on common benign anal conditions were more appropriate, more comprehensive, and more often supported by references than the answers of ChatGPT. The answers of both chatbots were above the 6th-grade reading level and may therefore be difficult for nonmedical individuals to comprehend.
