Comparison of applicability, difficulty, and discrimination indices of multiple-choice questions on medical imaging generated by different AI-based chatbots.

Author information

Karahan BN, Emekli E

Affiliations

Department of Radiology, Eskişehir Osmangazi University, Faculty of Medicine, Eskişehir, Türkiye.

Department of Radiology, Eskişehir Osmangazi University, Faculty of Medicine, Eskişehir, Türkiye; Translational Medicine Application and Research Center, Eskişehir Osmangazi University, Eskişehir, Türkiye.

Publication information

Radiography (Lond). 2025 Jul 16;31(5):103087. doi: 10.1016/j.radi.2025.103087.

DOI: 10.1016/j.radi.2025.103087
PMID: 40674889
Abstract

INTRODUCTION

Creating high-quality multiple-choice questions (MCQs) is vital in health education, particularly in fields like medical imaging. AI-based chatbots have emerged as a tool to automate this process. This study evaluates the applicability, difficulty, and discrimination indices of MCQs generated by various AI chatbots for medical imaging education.

METHODS

Eighty MCQs were generated by eight AI-based chatbots (Claude 3, Claude 3.5, ChatGPT-3.5, ChatGPT-4.0, Copilot, Gemini, Turin Q, and Writesonic) using lecture materials. These questions were evaluated for relevance, accuracy, and originality by radiology faculty and then administered to 56 students and 12 research assistants. The questions were analyzed using Miller's Pyramid to assess cognitive levels, and their difficulty and discrimination indices were calculated.
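
The abstract does not include the scoring procedure, but the two item statistics it names have standard classical test theory definitions. The sketch below is a minimal illustration under those definitions; the 27% upper/lower grouping for the discrimination index is a common convention and an assumption here, not something stated by the authors, and the example data are hypothetical.

# Minimal sketch (not from the paper) of the two item statistics
# named in METHODS, using classical test theory definitions.

def difficulty_index(item_correct: list[bool]) -> float:
    """p = share of examinees answering the item correctly.
    Higher p means an easier item; the study reports a mean p of 0.62."""
    return sum(item_correct) / len(item_correct)

def discrimination_index(item_correct: list[bool],
                         total_scores: list[float]) -> float:
    """D = p(upper group) - p(lower group), where the groups are the top
    and bottom 27% of examinees ranked by total test score (assumed
    convention; the paper's exact grouping is not given in the abstract)."""
    ranked = sorted(zip(total_scores, item_correct))
    k = max(1, round(0.27 * len(ranked)))
    lower = sum(correct for _, correct in ranked[:k]) / k
    upper = sum(correct for _, correct in ranked[-k:]) / k
    return upper - lower

# Toy example: one item, ten examinees (hypothetical data).
item = [True, True, False, True, False, True, True, False, True, True]
totals = [55, 60, 32, 71, 28, 66, 74, 35, 58, 69]
print(f"difficulty p = {difficulty_index(item):.2f}")                  # 0.70
print(f"discrimination D = {discrimination_index(item, totals):.2f}")  # 1.00

A p near 0.62, as reported in the study, means roughly 62% of examinees answered correctly, i.e. moderate difficulty; D above about 0.3 is conventionally read as good discrimination.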

DISCUSSION

AI-based chatbots generated MCQs suitable for medical imaging education, with 72.5 % of the questions deemed appropriate. Most questions assessed recall (79.31 %), suggesting that AI models excel at generating basic knowledge questions but struggle with higher cognitive skills. Differences in question quality were noted between chatbots, with Claude 3 being the most reliable. The difficulty index averaged 0.62, indicating a moderate level of difficulty, but some models produced easier questions.

CONCLUSION

AI chatbots show promise for automating MCQ creation in health education, though most questions focus on recall. For AI to fully support health education, further development is needed to improve question quality, especially in higher cognitive domains.

IMPLICATION FOR PRACTICE

AI-based chatbots can support educators in generating MCQs, especially for assessing basic knowledge in medical imaging. While useful for saving time, expert review remains essential to ensure question quality and to address higher-level cognitive skills. Integrating AI tools into assessment workflows may enhance efficiency, provided there is appropriate oversight.

Similar articles

1. Comparison of applicability, difficulty, and discrimination indices of multiple-choice questions on medical imaging generated by different AI-based chatbots.
   Radiography (Lond). 2025 Jul 16;31(5):103087. doi: 10.1016/j.radi.2025.103087.
2. AI in radiography education: Evaluating multiple-choice questions difficulty and discrimination.
   J Med Imaging Radiat Sci. 2025 Mar 28;56(4):101896. doi: 10.1016/j.jmir.2025.101896.
3. Artificial intelligence in radiology examinations: a psychometric comparison of question generation methods.
   Diagn Interv Radiol. 2025 Jul 21. doi: 10.4274/dir.2025.253407.
4. Examining the Role of Artificial Intelligence in Assessment: A Comparative Study of ChatGPT and Educator-Generated Multiple-Choice Questions in a Dental Exam.
   Eur J Dent Educ. 2025 Aug 10. doi: 10.1111/eje.70034.
5. Comparative performance of ChatGPT, Gemini, and final-year emergency medicine clerkship students in answering multiple-choice questions: implications for the use of AI in medical education.
   Int J Emerg Med. 2025 Aug 7;18(1):146. doi: 10.1186/s12245-025-00949-6.
6. Five advanced chatbots solving European Diploma in Radiology (EDiR) text-based questions: differences in performance and consistency.
   Eur Radiol Exp. 2025 Aug 19;9(1):79. doi: 10.1186/s41747-025-00591-0.
7. Performance of ChatGPT-4 Omni and Gemini 1.5 Pro on Ophthalmology-Related Questions in the Turkish Medical Specialty Exam.
   Turk J Ophthalmol. 2025 Aug 21;55(4):177-185. doi: 10.4274/tjo.galenos.2025.27895.
8. Information from digital and human sources: A comparison of chatbot and clinician responses to orthodontic questions.
   Am J Orthod Dentofacial Orthop. 2025 May 6. doi: 10.1016/j.ajodo.2025.04.008.
9. Prescription of Controlled Substances: Benefits and Risks.
10. Quality of Human Expert versus Large Language Model Generated Multiple Choice Questions in the Field of Mechanical Ventilation.
   Chest. 2025 Jul 18. doi: 10.1016/j.chest.2025.07.005.