The Accuracy of Artificial Intelligence ChatGPT in Oncology Examination Questions.

Affiliations

Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada; New York Proton Center, New York, New York.

New York Proton Center, New York, New York.

Publication information

J Am Coll Radiol. 2024 Nov;21(11):1800-1804. doi: 10.1016/j.jacr.2024.07.011. Epub 2024 Aug 2.

Abstract

This study assessed the accuracy of Chat Generative Pretrained Transformer (ChatGPT) in answering oncology examination questions in a one-shot learning setting. Consecutive national radiation oncology in-service multiple-choice examinations were collected and entered into ChatGPT 4o and ChatGPT 3.5. Each version's answers were then compared with the answer keys to determine whether each question was answered correctly and whether the newer version improved on the older one. A total of 600 consecutive questions were entered. ChatGPT 4o answered 72.2% of questions correctly, whereas ChatGPT 3.5 answered 53.8% correctly. Performance differed significantly by question category (P < .01). ChatGPT performed worse on questions about landmark studies and about treatment recommendations and planning. ChatGPT is a promising technology, and the latest version shows marked improvement. Although it still has limitations, with further development it may come to be considered a reliable resource for medical training and decision making in oncology.
