Lombardo Riccardo, Gallo Giacomo, Stira Jordi, Turchi Beatrice, Santoro Giuseppe, Riolo Sara, Romagnoli Matteo, Cicione Antonio, Tema Giorgia, Pastore Antonio, Al Salhi Yazan, Fuschi Andrea, Franco Giorgio, Nacchia Antonio, Tubaro Andrea, De Nunzio Cosimo
Department of Urology, 'Sapienza' University of Rome, Rome, Italy.
Prostate Cancer Prostatic Dis. 2025 Mar;28(1):229-231. doi: 10.1038/s41391-024-00789-0. Epub 2024 Jan 16.
ChatGPT, a natural language processing (NLP) tool created by OpenAI, can potentially be used as a quick source for obtaining information related to prostate cancer. This study aims to analyze the quality and appropriateness of ChatGPT's responses to inquiries related to prostate cancer compared with the European Association of Urology (EAU) 2023 prostate cancer guidelines. Overall, 195 questions were prepared according to the recommendations gathered in the prostate cancer section of the EAU 2023 guidelines. All questions were systematically submitted to the ChatGPT August 3 version, and two expert urologists independently assessed and assigned scores ranging from 1 to 4 to each response (1: completely correct, 2: correct but inadequate, 3: a mix of correct and misleading information, and 4: completely incorrect). Sub-analyses per chapter and per strength of recommendation were performed. Overall, 195 recommendations were evaluated. Of these, 50/195 (26%) were completely correct, 51/195 (26%) correct but inadequate, 47/195 (24%) a mix of correct and misleading information, and 47/195 (24%) completely incorrect. Across chapters, ChatGPT was particularly accurate in answering questions on follow-up and quality of life (QoL). The worst performance was recorded for the diagnosis and treatment chapters, with 19% and 30% of answers completely incorrect, respectively. When stratifying by strength of recommendation, no difference in accuracy was recorded between weak and strong recommendations (p > 0.05). ChatGPT shows poor accuracy when answering questions on the EAU prostate cancer guideline recommendations. Future studies should assess its performance after adequate training.
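The comparison of accuracy between strong and weak recommendations (p > 0.05) is consistent with a standard chi-square test on the two score distributions. Below is a minimal sketch of such a test; the per-group counts are hypothetical placeholders (only the column totals match the reported overall figures) and are not the study data.

```python
# Minimal sketch (not the authors' code): comparing score distributions for
# strong vs. weak EAU recommendations with a chi-square test of independence.
from scipy.stats import chi2_contingency

# Rows: strength of recommendation; columns: score 1 (completely correct),
# 2 (correct but inadequate), 3 (mixed), 4 (completely incorrect).
# Counts are hypothetical; only the column totals (50, 51, 47, 47) match the abstract.
contingency = [
    [28, 27, 24, 23],  # strong recommendations (hypothetical counts)
    [22, 24, 23, 24],  # weak recommendations (hypothetical counts)
]

chi2, p_value, dof, expected = chi2_contingency(contingency)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")  # p > 0.05 -> no significant difference
```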