Department of Urology, University of Kansas Medical Center, Kansas City, KS, USA.
Department of Urology, University of Florida College of Medicine, Gainesville, FL, USA.
World J Urol. 2024 Oct 29;42(1):600. doi: 10.1007/s00345-024-05326-1.
To evaluate and compare the performance of ChatGPT™ (OpenAI) and Bing AI™ (Microsoft) in responding to kidney stone treatment-related questions in accordance with the American Urological Association (AUA) guidelines, and to assess factors such as appropriateness of responses, emphasis on consulting healthcare providers, provision of references, and adherence to guidelines by each chatbot.
We developed 20 kidney stone evaluation and treatment-related questions based on the AUA Surgical Management of Stones guideline. Questions were asked to ChatGPT and Bing AI chatbots. We compared their responses utilizing the brief DISCERN tool as well as response appropriateness.
ChatGPT significantly outperformed Bing AI on questions 1-3, which evaluate the clarity, achievement, and relevance of responses (12.77 ± 1.71 vs. 10.17 ± 3.27; p < 0.01). In contrast, Bing AI always incorporated references, whereas ChatGPT never did. Consequently, the results for questions 4-6, which evaluated the quality of sources, consistently favored Bing AI over ChatGPT (10.8 vs. 4.28; p < 0.01). Notably, neither chatbot offered guidance contrary to the guidelines for pre-operative testing. However, recommendations against the guidelines were notable in specific scenarios: 30.5% for the treatment of adults with ureteral stones, 52.5% for adults with renal stones, and 20.5% across all patient treatment.
ChatGPT significantly outperformed Bing AI in providing responses with a clear aim, achieving that aim, and delivering relevant, appropriate answers based on the AUA surgical stone management guidelines. However, Bing AI provides references, which allows assessment of information quality. Additional studies are needed to further evaluate these chatbots and their potential use by clinicians and patients for urologic healthcare-related questions.