Akyol Onder Esra Nagehan, Ensari Esra, Ertan Pelin
Aksaray University Training and Research Hospital, Department of Paediatric Nephrology, Aksaray, TR-68200, Turkey.
Antalya City Hospital, Department of Paediatric Nephrology, Antalya, TR-07080, Turkey.
J Pediatr Urol. 2025 Apr;21(2):504-509. doi: 10.1016/j.jpurol.2024.12.002. Epub 2024 Dec 7.
Vesicoureteral reflux (VUR) is a common congenital or acquired urinary disorder in children. Chat Generative Pre-trained Transformer (ChatGPT) is an artificial intelligence-driven platform offering medical information. This research aims to assess the reliability and readability of ChatGPT-4o's answers regarding pediatric VUR for a general, non-medical audience.
Twenty of the most frequently asked English-language questions about VUR in children were used to evaluate ChatGPT-4o's responses. Two independent reviewers rated the reliability and quality using the Global Quality Scale (GQS) and a modified version of the DISCERN tool (mDISCERN). The readability of ChatGPT-4o's responses was assessed with the Flesch Reading Ease (FRE) Score, Flesch-Kincaid Grade Level (FKGL), Gunning Fog Index (GFI), Coleman-Liau Index (CLI), and Simple Measure of Gobbledygook (SMOG).
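As an illustration of how two of these readability indices are computed, the following is a minimal sketch of the standard FRE and FKGL formulas. The abstract does not state which tool the authors used; the function names and the vowel-group syllable counter here are assumptions for illustration, and a heuristic syllable count will differ somewhat from dedicated readability software.

```python
import re


def count_syllables(word: str) -> int:
    """Heuristic (assumed, not from the study): count vowel groups,
    discounting a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> dict:
    """Return Flesch Reading Ease and Flesch-Kincaid Grade Level for text."""
    # Naive sentence split on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)

    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    return {
        # Higher FRE means easier text.
        "FRE": 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word,
        # FKGL approximates a US school grade level.
        "FKGL": 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59,
    }


if __name__ == "__main__":
    sample = "Vesicoureteral reflux is the backward flow of urine toward the kidneys."
    print(readability(sample))
```

Both indices depend only on average sentence length and average syllables per word, so long sentences with polysyllabic medical terms push FRE down and FKGL up.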
Median mDISCERN and GQS scores were 4 (4-5) and 5 (3-5), respectively. According to the mDISCERN score, 55% of ChatGPT's responses had moderate reliability and 45% had good reliability; according to GQS, 95% were of high quality. The mean ± standard deviation scores for FRE, FKGL, SMOG, GFI, and CLI of the text were 26 ± 12, 15 ± 2.5, 16.3 ± 2, 18.8 ± 2.9, and 15.3 ± 2.2, respectively, indicating a high level of reading difficulty.
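To show why a mean FRE of 26 is read as high difficulty, the sketch below maps an FRE score onto commonly quoted Flesch difficulty bands. The band labels and the `fre_band` helper are assumptions for illustration; the abstract does not report which cut-offs the authors applied, and published tables vary slightly in wording.

```python
def fre_band(score: float) -> str:
    """Map a Flesch Reading Ease score to a commonly quoted difficulty band.

    Cut-offs follow the widely reproduced Flesch table; labels are
    illustrative and not taken from the study.
    """
    bands = [
        (90, "very easy"),
        (80, "easy"),
        (70, "fairly easy"),
        (60, "standard"),
        (50, "fairly difficult"),
        (30, "difficult"),
    ]
    for lower_bound, label in bands:
        if score >= lower_bound:
            return label
    return "very difficult"


if __name__ == "__main__":
    # Reported mean FRE, and the mean plus and minus one standard deviation.
    for value in (26, 26 + 12, 26 - 12):
        print(value, fre_band(value))
```

Under these assumed cut-offs, the reported mean falls in the lowest band, and even one standard deviation above the mean (38) remains in the "difficult" range.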
While ChatGPT-4o offers accurate and high-quality information about pediatric VUR, its readability poses challenges, as the content is difficult to understand for a general audience.
ChatGPT provides high-quality, accessible information about VUR. However, improving readability should be a priority to make this information more user-friendly for a broader audience.