Charles Sturt University, Wagga Wagga, New South Wales, Australia
J Nucl Med Technol. 2023 Dec 5;51(4):314-317. doi: 10.2967/jnmt.123.266485.
The emergence of ChatGPT has challenged academic integrity in teaching institutions, including those providing nuclear medicine training. Although previous evaluations of ChatGPT suggested a limited capacity for academic writing, the March 2023 release of generative pretrained transformer (GPT)-4 promises enhanced capabilities that require evaluation. Examinations (final and calculation) and written assignments for nuclear medicine subjects were attempted using both GPT-3.5 and GPT-4. The GPT-3.5 and GPT-4 responses were evaluated with Turnitin software for artificial intelligence scores, marked against standardized rubrics, and compared with the mean performance of student cohorts. ChatGPT powered by GPT-3.5 performed poorly on calculation examinations (31.4%), whereas GPT-4 scored 59.1%. GPT-3.5 failed each of 3 written tasks (39.9%), whereas GPT-4 passed each task (56.3%). Although GPT-3.5 poses minimal risk to academic integrity, GPT-4 substantially increases ChatGPT's usefulness as a cheating tool, although its output remains prone to hallucination and fabrication.