Use of generative large language models for patient education on common surgical conditions: a comparative analysis between ChatGPT and Google Gemini.

Author information

ELSenbawy Omar Mahmoud, Patel Keval Bhavesh, Wannakuwatte Randev Ayodhya, Thota Akhila N

Affiliations

Alexandria University, Alexandria, Egypt.

Narendra Modi Medical College, Ahmedabad, Gujarat, India.

Publication information

Updates Surg. 2025 Jan 15. doi: 10.1007/s13304-025-02074-8.

Abstract

It is increasingly important for patients to have easy access to information about their medical conditions, so that they can better understand and participate in health care decisions. Artificial intelligence (AI) has proven to be a fast, efficient, and effective tool for educating patients about their health conditions. This study aims to compare the responses of two AI tools, ChatGPT and Google Gemini, for conciseness and understandability of the information they provide on deep vein thrombosis, decubitus ulcers, and hemorrhoids. A cross-sectional study was conducted on the responses generated by ChatGPT and Google Gemini for the post-surgical complications of deep vein thrombosis, decubitus ulcers, and hemorrhoids. Each response was evaluated with the Flesch-Kincaid calculator for total number of words, number of sentences, average words per sentence, average syllables per word, grade level, and ease score. In addition, similarity was scored using QuillBot and reliability using a modified DISCERN score. The results were then analyzed with an unpaired (two-sample) t-test to compare the means of the two AI tools and determine whether one was superior. ChatGPT's responses required a higher education level to understand, as indicated by their higher grade levels and lower ease scores. The deep vein thrombosis brochure had the lowest ease score and the highest grade level, which on the Flesch scales corresponds to the hardest text to read. ChatGPT's output showed more similarity with information available on the internet, as measured by the QuillBot plagiarism checker. Reliability, as measured by the modified DISCERN score, was similar for both AI tools. Although the scores differ between the two tools, the P values obtained do not provide enough evidence to conclude that one AI tool is superior to the other.
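The readability measures and the comparison named in the abstract can be illustrated with a minimal Python sketch. It uses the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas; the function names are hypothetical, and the pooled-variance (Student) form of the unpaired t statistic is an assumption, since the abstract does not say whether equal variances were assumed or which software the authors used.

```python
from math import sqrt
from statistics import mean, variance


def reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: a higher score means easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def grade_level(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: a higher value means harder text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def pooled_t_statistic(a, b):
    """Unpaired two-sample t statistic with pooled variance.

    Assumption: equal variances in both groups (Student's t-test);
    the abstract does not specify the variant that was used.
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
```

For a hypothetical text of 100 words, 10 sentences, and 150 syllables (illustrative numbers, not study data), these functions give a reading ease of about 69.8 and a grade level of about 6.0. Because the two scales run in opposite directions, a low ease score together with a high grade level indicates harder text.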
