ChatGPT as a prospective undergraduate and medical school student.

Affiliation

University of Cagliari, Cagliari, Italy.

Publication information

PLoS One. 2024 Oct 23;19(10):e0308157. doi: 10.1371/journal.pone.0308157. eCollection 2024.

DOI: 10.1371/journal.pone.0308157
PMID: 39441779
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11498684/

Abstract

This article reports the results of an experiment conducted with ChatGPT to see how its performance compares to human performance on tests that require specific knowledge and skills, such as university admission tests. We chose a general undergraduate admission test and two tests for admission to biomedical programs: the Scholastic Assessment Test (SAT), the Cambridge BioMedical Admission Test (BMAT), and the Italian Medical School Admission Test (IMSAT). In particular, we looked closely at the difference in performance between ChatGPT-4 and its predecessor, ChatGPT-3.5, to assess its evolution. The performance of ChatGPT-4 showed a significant improvement over ChatGPT-3.5 and, compared to real students, was on average within the top 10% in the SAT test, while the score in the IMSAT test granted admission to the two highest ranked Italian medical schools. In addition to the performance analysis, we provide a qualitative analysis of incorrect answers and a classification of three different types of logical and computational errors made by ChatGPT-4, which reveal important weaknesses of the model. This provides insight into the skills needed to use these models effectively despite their weaknesses, and also suggests possible applications of our analysis in the field of education.
