Comparative analysis of three chatbot responses on pediatric primary nocturnal enuresis.

Authors

Boztas Asya Eylem, Ensari Esra

Affiliations

Health and Science University Dr. Behcet Uz Pediatric Diseases and Surgery Training and Research Hospital, Department of Pediatric Surgery, Kultur mh. Dr.Mustafa Enver Bey cd. No:32 D:10 Konak, Izmir, Turkey.

Antalya City Hospital, Department of Paediatric Nephrology, 07080, Antalya, Turkey.

Publication information

J Pediatr Urol. 2025 Apr 30. doi: 10.1016/j.jpurol.2025.04.031.

DOI:10.1016/j.jpurol.2025.04.031

PMID:40355311

Abstract

BACKGROUND

The purpose of this study was to evaluate the accuracy and reproducibility of the answers given by ChatGPT-4o®, Gemini® and Copilot® to frequently asked questions about pediatric primary nocturnal enuresis.

METHODS

Forty frequently asked questions about primary nocturnal enuresis were posed twice, one week apart, to ChatGPT-4o, Gemini and Copilot. One pediatric surgeon and one nephrologist independently scored the answers into four categories: comprehensive/correct (1), incomplete/partially correct (2), a mix of accurate and inaccurate/misleading (3), and completely inaccurate/irrelevant (4). The accuracy and reproducibility of each chatbot's answers were evaluated.

RESULTS

Among these commonly used chatbots, ChatGPT-4o had the highest rate of completely correct responses, at 92.5 %. Gemini answered 50 % of the questions completely correctly, and Copilot was the weakest, with 45 % completely accurate answers. Copilot also gave completely inaccurate/irrelevant responses to 2.5 % of the questions. The reproducibility of ChatGPT-4o, Gemini and Copilot was 85 %, 77.5 % and 70 %, respectively.
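As a rough arithmetic check, the reported percentages can be converted back into question counts. The sketch below assumes that every rate uses the 40 questions from the Methods as its denominator; the abstract does not state this explicitly, and the helper name `implied_count` is hypothetical.

```python
from fractions import Fraction

# Assumption: each reported rate is a share of the 40 questions in the Methods.
N_QUESTIONS = 40

# Percentages as reported in the Results (kept as strings for exact arithmetic).
reported = {
    "ChatGPT-4o": {"completely_correct": "92.5", "reproducibility": "85"},
    "Gemini": {"completely_correct": "50", "reproducibility": "77.5"},
    "Copilot": {"completely_correct": "45", "reproducibility": "70"},
}


def implied_count(percent: str, n: int = N_QUESTIONS) -> Fraction:
    """Number of questions implied by a percentage of n (hypothetical helper)."""
    return Fraction(percent) * n / 100


counts = {
    bot: {metric: implied_count(value) for metric, value in rates.items()}
    for bot, rates in reported.items()
}

# Chatbots ordered by completely correct answers, highest first.
ranking = sorted(
    counts, key=lambda bot: counts[bot]["completely_correct"], reverse=True
)

for bot in ranking:
    c = counts[bot]
    print(bot, c["completely_correct"], c["reproducibility"])
```

Under that assumption every percentage corresponds to a whole number of questions, which is consistent with a denominator of 40 but does not prove it.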

CONCLUSION

ChatGPT-4o was the most successful of the three chatbots at providing accurate responses about nocturnal enuresis. Patients and their parents can use it, especially for simple, low-complexity medical questions. However, it should be used alongside the guidance of an expert healthcare professional.

