Early prediction of medical students' performance in high-stakes examinations using machine learning approaches.

Authors

Mastour Haniye, Dehghani Toktam, Moradi Ehsan, Eslami Saeid

Affiliations

Department of Medical Education, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.

Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.

Publication information

Heliyon. 2023 Jul 13;9(7):e18248. doi: 10.1016/j.heliyon.2023.e18248. eCollection 2023 Jul.

DOI:10.1016/j.heliyon.2023.e18248

PMID:37519702

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10372649/

Abstract

INTRODUCTION

Since the advent of medical education systems, managing high-stakes examinations has been both a top priority and a persistent challenge for policymakers. Machine learning (ML) techniques could offer an effective alternative to medical licensing examinations, particularly during crises such as the COVID-19 outbreak. This study uses ML models to develop a framework for predicting medical students' performance on high-stakes examinations such as the Comprehensive Medical Basic Sciences Examination (CMBSE).

MATERIAL AND METHODS

Predicting students' status and score on high-stakes examinations poses several challenges: the numbers of failing and passing students are imbalanced, the features are numerous, heterogeneous, and complex, and both at-risk and top-performing students need to be identified. This study compares two major categories of ML approaches: classic models (logistic regression (LR), support vector machine (SVM), and k-nearest neighbors (KNN)) and ensemble models (voting, bagging (BG), random forests (RF), adaptive boosting (ADA), extreme gradient boosting (XGB), and stacking).
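To illustrate the simplest of the ensemble categories listed above, the following is a minimal sketch of hard (majority) voting over the outputs of several base classifiers. It is not the authors' implementation: the function name `majority_vote`, the `"pass"`/`"fail"` labels, and the per-model predictions are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label list by hard voting.

    predictions_per_model: list of equal-length lists, one list per base model.
    """
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    # zip(*...) yields, for each student, the tuple of votes from all models
    for votes in zip(*predictions_per_model):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical predictions from three base models for five students
lr_pred = ["pass", "pass", "fail", "pass", "fail"]
svm_pred = ["pass", "fail", "fail", "pass", "pass"]
knn_pred = ["fail", "pass", "fail", "pass", "pass"]

ensemble_pred = majority_vote([lr_pred, svm_pred, knn_pred])
print(ensemble_pred)
```

With an odd number of base models and two classes, as here, a tie cannot occur. Bagging, boosting, and stacking combine base models in more elaborate ways and are not covered by this sketch.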

RESULTS

The models' discrimination ability was assessed on a real dataset containing information on medical students over a five-year period (n = 1005). Ensemble ML models, specifically RF and stacking, performed best in predicting CMBSE status. For score prediction, LR was the best of the classic regressors, with the lowest root-mean-square deviation (RMSD) (0.134) and the highest coefficient of determination (R²) (0.62) in that group, whereas the RF model had the lowest RMSD (0.077) and the highest R² (0.80) overall. Furthermore, grade point average (GPA) and grades in Anatomical Sciences, Biochemistry, Parasitology, and Entomology showed the strongest positive correlations with the outcomes.
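For readers unfamiliar with the two regression metrics reported here, the following is a minimal sketch of how RMSD and R² are computed from true and predicted scores. A lower RMSD and a higher R² both indicate a better fit. The function names and the example scores are hypothetical and are not taken from the study's data.

```python
import math


def rmsd(y_true, y_pred):
    """Root-mean-square deviation between true and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical exam scores (illustrative values only)
y_true = [0.2, 0.4, 0.6, 0.8]
y_pred = [0.25, 0.35, 0.65, 0.75]

print(f"RMSD = {rmsd(y_true, y_pred):.3f}")
print(f"R^2  = {r_squared(y_true, y_pred):.3f}")
```

In this toy example every prediction is off by 0.05, so the RMSD is 0.05, and the residual sum of squares (0.01) is small relative to the total sum of squares (0.2), giving an R² of 0.95.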

CONCLUSION

The comparison showed that ensemble ML models outperform classic models. The presented framework could therefore be considered a suitable alternative to the CMBSE and other comparable medical licensing examinations.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/de97/10372649/9e6bd621c0fa/ga1.jpg
