Department of Urogynaecology, Cork University Maternity Hospital, Cork, Ireland.
Department of Obstetrics and Gynecology, Shaare Zedek Medical Center, Affiliated with the Hebrew University School of Medicine, Jerusalem, Israel.
Int J Gynaecol Obstet. 2024 Dec;167(3):1004-1009. doi: 10.1002/ijgo.15751. Epub 2024 Jun 30.
To evaluate the effectiveness of ChatGPT in providing insights into common urinary incontinence concerns within urogynecology. By analyzing the model's responses against established benchmarks of accuracy, comprehensiveness, and safety, the study aimed to quantify its usefulness for informing patients and aiding healthcare providers.
An expert-driven questionnaire was developed, inviting urogynecologists worldwide to assess ChatGPT's answers to 10 carefully selected questions on urinary incontinence (UI). These assessments focused on the accuracy of the responses, their comprehensiveness, and whether they raised any safety issues. Subsequent statistical analyses determined the average consensus among experts and identified the proportion of responses receiving favorable evaluations (a score of 4 or higher).
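The two summary statistics described above (a mean score and the share of ratings at 4 or higher) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `summarize`, the integer rating scale, and the example ratings are all assumptions made for demonstration, and the numbers are not data from the study.

```python
from statistics import mean

# From the abstract: a rating counts as favorable at "a score of 4 or higher".
FAVORABLE_THRESHOLD = 4


def summarize(ratings):
    """Return (mean score, fraction of ratings at or above the threshold).

    `ratings` is a non-empty list of numeric expert scores. The name and
    signature are hypothetical; the abstract does not describe the tooling.
    """
    if not ratings:
        raise ValueError("ratings must be non-empty")
    favorable = sum(1 for r in ratings if r >= FAVORABLE_THRESHOLD)
    return mean(ratings), favorable / len(ratings)


# Hypothetical ratings, invented for illustration only.
example_ratings = [5, 4, 3, 4, 2, 5, 4, 3, 4, 4]
avg, share = summarize(example_ratings)
print(f"mean = {avg:.1f}, favorable = {share:.0%}")
```

Reporting both numbers is useful because a mean near 4 can hide a sizeable minority of low ratings, which the favorable proportion makes visible.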
Of the 50 urogynecologists approached worldwide, 37 (74%) responded with assessments of ChatGPT's responses on UI. The overall feedback averaged a score of 4.0, indicating a positive reception. Accuracy scores averaged 3.9 with 71% rated favorably, whereas comprehensiveness scored slightly higher at 4 with 74% favorable ratings. Safety assessments also averaged 4 with 74% favorable responses.
This investigation underlines ChatGPT's favorable performance across the evaluated domains of accuracy, comprehensiveness, and safety for UI queries. Despite this broadly positive reception, the study also points to clear room for improvement, particularly in the precision of the information provided. Improving ChatGPT's accuracy and delivering more precise responses are essential steps toward strengthening its utility as an educational resource for patients and a supportive tool for healthcare practitioners.