GPT-4 in Nuclear Medicine Education: Does It Outperform GPT-3.5?

Affiliation

Charles Sturt University, Wagga Wagga, New South Wales, Australia

Publication Information

J Nucl Med Technol. 2023 Dec 5;51(4):314-317. doi: 10.2967/jnmt.123.266485.

Abstract

The emergence of ChatGPT has challenged academic integrity in teaching institutions, including those providing nuclear medicine training. Although previous evaluations of ChatGPT have suggested a limited scope for academic writing, the March 2023 release of generative pretrained transformer (GPT)-4 promises enhanced capabilities that require evaluation. Examinations (final and calculation) and written assignments for nuclear medicine subjects were tested using GPT-3.5 and GPT-4. GPT-3.5 and GPT-4 responses were evaluated by Turnitin software for artificial intelligence scores, marked against standardized rubrics, and compared with the mean performance of student cohorts. ChatGPT powered by GPT-3.5 performed poorly in calculation examinations (31.4%), compared with GPT-4 (59.1%). GPT-3.5 failed each of 3 written tasks (39.9%), whereas GPT-4 passed each task (56.3%). Although GPT-3.5 poses a minimal risk to academic integrity, GPT-4 can significantly enhance ChatGPT's usefulness as a cheating tool, though it remains prone to hallucination and fabrication.

