Leis Angela, Mayer Miguel-Angel, Mayer Alex
Hospital del Mar Research Institute, Barcelona, Spain.
Hospital del Mar, Barcelona, Spain.
Stud Health Technol Inform. 2025 May 15;327:1054-1058. doi: 10.3233/SHTI250544.
The growing use of Artificial Intelligence (AI) in healthcare, and in particular the potential of generative AI models such as ChatGPT-4, is a trending topic. This study examines how ChatGPT-4 performed on the national Medicine Residency exam in Spain, a highly selective test for access to the medical specialization training program known as MIR. ChatGPT-4 answered 210 questions, including 25 that required image interpretation. The chatbot correctly answered 150 out of 200 questions, achieving an estimated ranking of around 1,900-2,300 out of 11,577 candidates, a performance that would allow access to most medical specialties in Spain. No significant differences were found between questions requiring image analysis and those that did not, but ChatGPT-4 struggled with the more difficult questions, showing a higher error rate on complex problems, much as human candidates do. Despite its potential as an educational and problem-solving tool, the study highlights ChatGPT's limitations, including occasional "AI hallucinations" (incorrect or nonsensical answers) and variability in its responses when questions were repeated. The study emphasizes that while AI tools such as ChatGPT can assist in education and medical tasks, they cannot replace qualified healthcare professionals, and their output requires careful verification.