Ganapathy Arthi, Kaushal Parul
Department of Anatomy, Teaching Block, All India Institute of Medical Sciences, New Delhi, 110029 India.
Med Sci Educ. 2025 Feb 15;35(3):1295-1304. doi: 10.1007/s40670-025-02303-0. eCollection 2025 Jun.
The integration of AI chatbots into education has gained traction, particularly in medical fields such as anatomy. This study aims to evaluate and compare the responses of ChatGPT 4o mini and Gemini across different cognitive domains of anatomy education.
A cross-sectional study was conducted to assess the responses of these two AI chatbots to a set of anatomy questions selected from the Manual on Competency-Based Undergraduate Curriculum. Questions were categorized into the knowledge, comprehension, and application levels of the cognitive domain. Responses were scored against an answer key prepared by anatomy experts, and relevant comparative statistical analyses were performed.
The overall performance of ChatGPT 4o mini (76.15%) was significantly superior to that of Gemini (72.84%). ChatGPT 4o mini outperformed Gemini on application-level questions, whereas Gemini scored higher on comprehension-level questions (76.88% vs. 73.66%). Both chatbots exhibited factual inaccuracies and limitations in providing contextually accurate responses, particularly for application-level questions.
Both ChatGPT 4o mini and Gemini demonstrate potential as educational tools in anatomy, with strengths and limitations that vary by cognitive domain. While AI chatbots can supplement traditional learning methods, they require ongoing refinement and validation. To ensure the responsible integration of AI into medical education, close attention must be devoted to faculty and student training, setting up a suitable IT environment, and addressing ethical issues. Future research should focus on expanding question pools, incorporating user feedback, and comparing chatbot performance with traditional educational approaches to enhance their effectiveness.