Department of Diagnostic Imaging, Division of Neuroradiology, The Hospital for Sick Children, Toronto, Canada.
Department of Medical Imaging, University of Toronto, Toronto, Canada.
Can Assoc Radiol J. 2024 Feb;75(1):69-73. doi: 10.1177/08465371231171125. Epub 2023 Apr 20.
To assess the accuracy of answers provided by ChatGPT-3 when prompted with questions from the daily routine of radiologists, and to evaluate the text responses when ChatGPT-3 was prompted to provide references for a given answer. ChatGPT-3 (OpenAI, San Francisco) is an artificial intelligence chatbot based on a large language model (LLM) designed to generate human-like text. A total of 88 questions were submitted to ChatGPT-3 as textual prompts, distributed equally across 8 subspecialty areas of radiology. The responses provided by ChatGPT-3 were assessed for correctness by cross-checking them against peer-reviewed, PubMed-listed references. In addition, the references provided by ChatGPT-3 were evaluated for authenticity. Of the 88 responses to radiological questions, 59 (67%) were correct, while 29 (33%) contained errors. Of the 343 references provided, only 124 (36.2%) could be found through an internet search, while 219 (63.8%) appeared to have been generated by ChatGPT-3 itself. Among the 124 identified references, only 47 (37.9%) were considered to provide enough background to correctly answer 24 questions (37.5%). In this pilot study, ChatGPT-3 provided correct responses to questions from the daily clinical routine of radiologists in only about two thirds of cases, while the remainder contained errors. The majority of the provided references could not be found, and only a minority of them contained the correct information to answer the question. Caution is advised when using ChatGPT-3 to retrieve radiological information.
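The abstract does not describe the authors' tooling (the study likely used the public ChatGPT web interface). The following is a minimal Python sketch, assuming the OpenAI API client, of how such a prompt-and-cite workflow could be scripted; the model name, prompt wording, and question data below are illustrative placeholders, not the study's actual setup.

```python
# Hypothetical sketch of the study's prompt-and-reference workflow.
# Assumptions (not from the paper): OpenAI Python client, model name,
# prompt wording, and the `questions` data are all placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative structure: 8 subspecialties x 11 questions = 88 questions.
questions = {
    "neuroradiology": [
        "What is the most common primary malignant brain tumor in adults?",
        # ... further questions per subspecialty ...
    ],
    # ... 7 more subspecialty lists ...
}

def ask_with_references(question: str) -> str:
    """Submit one question and ask the model to cite supporting references."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # stand-in for "ChatGPT-3"; exact model unspecified
        messages=[
            {
                "role": "user",
                "content": f"{question} Please list peer-reviewed, "
                           "PubMed-listed references supporting your answer.",
            },
        ],
    )
    return response.choices[0].message.content

for subspecialty, items in questions.items():
    for q in items:
        answer = ask_with_references(q)
        # In the study, correctness was then graded manually against
        # peer-reviewed literature, and each cited reference was checked
        # for existence via internet search.
        print(subspecialty, "|", answer[:80], "...")
```

Note that the manual verification steps (grading answers and confirming that cited references exist) are the core of the study's method and cannot be replaced by the script; the sketch only automates question submission.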