Battle of the bots: a comparative analysis of ChatGPT and bing AI for kidney stone-related questions.

Affiliations

Department of Urology, University of Kansas Medical Center, Kansas City, KS, USA.

Department of Urology, University of Florida College of Medicine, Gainesville, FL, USA.

Publication information

World J Urol. 2024 Oct 29;42(1):600. doi: 10.1007/s00345-024-05326-1.

DOI: 10.1007/s00345-024-05326-1
PMID: 39470812

Abstract

OBJECTIVES

To evaluate and compare the performance of ChatGPT™ (OpenAI) and Bing AI™ (Microsoft) in responding to kidney stone treatment-related questions against the American Urological Association (AUA) guidelines, and to assess each chatbot's appropriateness, emphasis on consulting healthcare providers, use of references, and adherence to guidelines.

METHODS

We developed 20 questions on kidney stone evaluation and treatment based on the AUA Surgical Management of Stones guideline. The questions were posed to the ChatGPT and Bing AI chatbots. We compared their responses using the brief DISCERN tool and an assessment of response appropriateness.

RESULTS

ChatGPT significantly outperformed Bing AI on brief DISCERN questions 1-3, which evaluate the clarity, achievement, and relevance of responses (12.77 ± 1.71 vs. 10.17 ± 3.27; p < 0.01). In contrast, Bing AI always incorporated references, whereas ChatGPT never did. Consequently, the results for questions 4-6, which evaluate the quality of sources, consistently favored Bing AI over ChatGPT (10.8 vs. 4.28; p < 0.01). Notably, neither chatbot gave advice contrary to the guidelines on pre-operative testing. However, recommendations contrary to the guidelines were notable in specific scenarios: 30.5% for the treatment of adults with ureteral stones, 52.5% for adults with renal stones, and 20.5% for all patient treatment.

CONCLUSIONS

ChatGPT significantly outperformed Bing AI in providing responses that had a clear aim, achieved that aim, and were relevant and appropriate according to the AUA surgical stone management guidelines. However, Bing AI provides references, which allows the quality of its information to be assessed. Additional studies are needed to further evaluate these chatbots and their potential use by clinicians and patients for urologic healthcare-related questions.
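
The subscale figures reported above (questions 1-3 and questions 4-6) can be read as per-response totals summarized as mean ± SD. The following is a minimal sketch of that aggregation, not the authors' analysis code. It assumes each of the six brief DISCERN items is rated on a 1-5 scale, so each three-item subscale totals at most 15. The function names and the example ratings are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def subscale_totals(ratings):
    """Split one response's six item ratings into (Q1-3 total, Q4-6 total)."""
    if len(ratings) != 6:
        raise ValueError("expected six item ratings")
    return sum(ratings[:3]), sum(ratings[3:])


def summarize(all_ratings):
    """Return (mean, sample SD) of the Q1-3 and Q4-6 totals across responses."""
    pairs = [subscale_totals(r) for r in all_ratings]
    q13 = [p[0] for p in pairs]
    q46 = [p[1] for p in pairs]
    return (mean(q13), stdev(q13)), (mean(q46), stdev(q46))


# Hypothetical ratings for three chatbot responses (illustrative only).
example = [
    [5, 4, 4, 1, 1, 2],
    [4, 4, 5, 2, 1, 1],
    [5, 5, 4, 1, 2, 1],
]

(q13_mean, q13_sd), (q46_mean, q46_sd) = summarize(example)
print(f"Q1-3: {q13_mean:.2f} ± {q13_sd:.2f}")
print(f"Q4-6: {q46_mean:.2f} ± {q46_sd:.2f}")
```

The abstract does not state which statistical test produced the reported p-values, so the sketch stops at the descriptive summary.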
