Anatomy exam model for the circulatory and respiratory systems using GPT-4: a medical school study.

Author Information

Tekin Ayla, Karamus Nizameddin Fatih, Çolak Tuncay

Affiliations

Faculty of Medicine, Anatomy Department, Kocaeli University, Umuttepe Campus, Kocaeli, 41001, Turkey.

Faculty of Medicine, Anatomy Department, Altinbas University, İstanbul, Turkey.

Publication Information

Surg Radiol Anat. 2025 Jun 10;47(1):158. doi: 10.1007/s00276-025-03667-z.

DOI: 10.1007/s00276-025-03667-z
PMID: 40495075
Abstract

PURPOSE

The study aimed to evaluate the effectiveness of anatomy multiple-choice questions (MCQs) generated by GPT-4, focusing on their methodological appropriateness and their alignment with the cognitive levels defined by Bloom's revised taxonomy, with the goal of enhancing assessment.

METHODS

The assessment questions for medical students were generated with GPT-4 and comprised 240 MCQs organized into subcategories consistent with Bloom's revised taxonomy. The prompts used to generate the MCQs included details about the lesson's purpose, the learning objectives, and the students' prior experience, to ensure the questions were contextually appropriate. A set of 30 MCQs was randomly selected from the generated questions for testing. A total of 280 students took the examination, from whose responses the difficulty index of each MCQ, the item discrimination index, and the overall test difficulty were computed. Expert anatomists examined the taxonomic accuracy of GPT-4's questions.
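
The difficulty and discrimination indices mentioned above are standard classical-test-theory item statistics. Below is a minimal sketch of how they are typically computed, assuming a binary response matrix (1 = correct, 0 = incorrect) and the common upper/lower 27% grouping convention; the grouping rule the authors actually used is not stated in the abstract, and the function name item_statistics is illustrative:

```python
import numpy as np

def item_statistics(responses: np.ndarray, group_frac: float = 0.27):
    """Per-item difficulty and discrimination indices (classical test theory).

    responses : array of shape (n_students, n_items), 1 = correct, 0 = incorrect.
    group_frac: fraction of examinees in the upper/lower scoring groups
                (0.27 is a common convention; the paper's rule is not stated).
    """
    n_students, _ = responses.shape
    totals = responses.sum(axis=1)          # each student's total score

    # Difficulty index p: proportion of all students answering the item correctly.
    difficulty = responses.mean(axis=0)

    # Discrimination index D: proportion correct in the top-scoring group
    # minus proportion correct in the bottom-scoring group.
    n_group = max(1, round(group_frac * n_students))
    order = np.argsort(totals)
    lower = responses[order[:n_group]]
    upper = responses[order[-n_group:]]
    discrimination = upper.mean(axis=0) - lower.mean(axis=0)

    return difficulty, discrimination
```

With 280 students and 30 items, responses would be a 280 × 30 array, and the reported average test difficulty of 0.5012 corresponds to difficulty.mean().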

RESULTS

Students achieved a median score of 50 points (range, 36.67-60) on the test. The test's internal consistency, assessed with KR-20, was 0.737, and the average test difficulty was 0.5012. Difficulty and discrimination indices were obtained for each AI-generated question. Expert anatomists' taxonomy-based classifications matched GPT-4's in 26.6% of cases. Meanwhile, 80.9% of students found the questions clear, and 85.8% expressed interest in retaking the assessment.
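
KR-20 (Kuder-Richardson Formula 20), the internal-consistency coefficient reported above, applies to dichotomously scored tests. A minimal sketch of the computation, under the same binary response-matrix assumption as the earlier example:

```python
import numpy as np

def kr20(responses: np.ndarray) -> float:
    """Kuder-Richardson Formula 20:

    KR-20 = k/(k-1) * (1 - sum(p_i * q_i) / Var(total scores)),

    where k is the number of items, p_i the proportion of examinees
    answering item i correctly, and q_i = 1 - p_i.
    """
    _, k = responses.shape
    p = responses.mean(axis=0)                           # per-item proportion correct
    q = 1.0 - p
    total_variance = responses.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1.0 - (p * q).sum() / total_variance)
```

By common rules of thumb, values near 0.7, such as the reported 0.737, indicate acceptable internal consistency for a classroom test. Conventions differ on whether sample (ddof=1) or population (ddof=0) variance is used; with 280 examinees the difference is negligible.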

CONCLUSION

This study demonstrates GPT-4's significant potential for generating medical-education exam questions. While GPT-4 effectively assesses basic knowledge recall, it does not sufficiently evaluate the higher-order cognitive processes outlined in Bloom's revised taxonomy. Future research should consider alternative approaches that combine AI with expert evaluation and specialized multimodal models.

Similar Articles

1. Climbing Bloom's taxonomy pyramid: Lessons from a graduate histology course. Anat Sci Educ. 2017 Sep;10(5):456-464. doi: 10.1002/ase.1685. Epub 2017 Feb 23.
2. Assessing ChatGPT's Mastery of Bloom's Taxonomy Using Psychosomatic Medicine Exam Questions: Mixed-Methods Study. J Med Internet Res. 2024 Jan 23;26:e52113. doi: 10.2196/52113.
3. AI versus human-generated multiple-choice questions for medical education: a cohort study in a high-stakes examination. BMC Med Educ. 2025 Feb 8;25(1):208. doi: 10.1186/s12909-025-06796-6.
4. Comparing the Performance of ChatGPT-4 and Medical Students on MCQs at Varied Levels of Bloom's Taxonomy. Adv Med Educ Pract. 2024 May 10;15:393-400. doi: 10.2147/AMEP.S457408. eCollection 2024.
5. The Blooming Anatomy Tool (BAT): A discipline-specific rubric for utilizing Bloom's taxonomy in the design and evaluation of assessments in the anatomical sciences. Anat Sci Educ. 2015 Nov-Dec;8(6):493-501. doi: 10.1002/ase.1507. Epub 2014 Dec 16.
6. Comparing the performance of artificial intelligence learning models to medical students in solving histology and embryology multiple choice questions. Ann Anat. 2024 Jun;254:152261. doi: 10.1016/j.aanat.2024.152261. Epub 2024 Mar 21.
7. Can ChatGPT Generate Acceptable Case-Based Multiple-Choice Questions for Medical School Anatomy Exams? A Pilot Study on Item Difficulty and Discrimination. Clin Anat. 2025 May;38(4):505-510. doi: 10.1002/ca.24271. Epub 2025 Mar 24.
8. Role of comprehension on performance at higher levels of Bloom's taxonomy: Findings from assessments of healthcare professional students. Anat Sci Educ. 2018 Sep;11(5):433-444. doi: 10.1002/ase.1768. Epub 2018 Jan 18.
9. Measuring the impact of the flipped anatomy classroom: The importance of categorizing an assessment by Bloom's taxonomy. Anat Sci Educ. 2017 Mar;10(2):170-175. doi: 10.1002/ase.1635. Epub 2016 Jul 18.
