Use of Artificial Intelligence in Vesicoureteral Reflux Disease: A Comparative Study of Guideline Compliance.

Authors

Sarikaya Mehmet, Ozcan Siki Fatma, Ciftci Ilhan

Affiliation

Department of Pediatric Surgery, Faculty of Medicine, Selcuk University, Konya 42100, Turkey.

Publication

J Clin Med. 2025 Mar 30;14(7):2378. doi: 10.3390/jcm14072378.

Abstract

This study aimed to evaluate the compliance of four different artificial intelligence applications (ChatGPT-4.0, Bing AI, Google Bard, and Perplexity) with the American Urological Association (AUA) vesicoureteral reflux (VUR) management guidelines. Fifty-one questions derived from the AUA guidelines were asked of each AI application. Two experienced paediatric surgeons independently scored the responses using a five-point Likert scale. Inter-rater agreement was analysed using the intraclass correlation coefficient (ICC). ChatGPT-4.0, Bing AI, Google Bard, and Perplexity received mean scores of 4.91, 4.85, 4.75 and 4.70, respectively. There was no statistically significant difference between the accuracy of the AI applications (p = 0.223). The inter-rater ICC values were above 0.9 for all platforms, indicating a high level of consistency in scoring. The evaluated AI applications agreed highly with the AUA VUR management guidelines. These results suggest that AI applications may be a potential tool for providing guideline-based recommendations in paediatric urology.
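
For readers unfamiliar with the ICC, the following is a minimal sketch of how an inter-rater ICC could be computed for two raters scoring the same set of responses. The abstract does not say which ICC form the authors used; this sketch assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). The function name `icc_2_1` and the scores shown are hypothetical and are not taken from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one row per rated item (e.g. one AI
    response), each row holding one score per rater.
    ASSUMPTION: the source abstract does not state which ICC form was used.
    """
    n = len(ratings)          # number of rated items
    k = len(ratings[0])       # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 items and 2 raters, with no missing scores")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Mean squares from a two-way ANOVA without replication.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    if denom == 0:
        raise ValueError("ICC is undefined when the ratings have no variance")
    return (ms_rows - ms_err) / denom


# Hypothetical five-point Likert scores from two raters for six responses.
hypothetical_scores = [[5, 5], [5, 4], [4, 4], [5, 5], [3, 3], [4, 5]]
icc = icc_2_1(hypothetical_scores)
```

Values close to 1 indicate that the two raters assign nearly the same score to each response, which is how the reported ICC values above 0.9 are read.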

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0957/11989457/327ec0d6e503/jcm-14-02378-g001.jpg
