Mastour Haniye, Dehghani Toktam, Moradi Ehsan, Eslami Saeid
Department of Medical Education, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
Heliyon. 2023 Jul 13;9(7):e18248. doi: 10.1016/j.heliyon.2023.e18248. eCollection 2023 Jul.
Managing high-stakes exams has been a top priority and a challenge for policymakers since the advent of medical education systems. Machine learning (ML) techniques could offer an effective alternative to medical licensing examinations, particularly during crises such as the COVID-19 outbreak. This study uses ML models to develop a framework for predicting medical students' performance on high-stakes exams, such as the Comprehensive Medical Basic Sciences Examination (CMBSE).
Prediction of students' status and score on high-stakes examinations faces several challenges, including an imbalanced number of failing and passing students, a large number of heterogeneous and complex features, and the need to identify at-risk and top-performing students. In this study, two major categories of ML approaches are compared: first, classic models (logistic regression (LR), support vector machine (SVM), and k-nearest neighbors (KNN)), and second, ensemble models (voting, bagging (BG), random forests (RF), adaptive boosting (ADA), extreme gradient boosting (XGB), and stacking).
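As a rough illustration of the simplest ensemble idea listed above, the sketch below combines pass/fail predictions from three base classifiers by majority (hard) voting. The function name `hard_vote` and the sample predictions are hypothetical; the abstract does not state which voting scheme, base learners, or hyperparameters the study used.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Combine per-model label lists by majority vote, one student at a time."""
    n_students = len(predictions_per_model[0])
    if any(len(p) != n_students for p in predictions_per_model):
        raise ValueError("every model must predict for the same students")
    combined = []
    for student_votes in zip(*predictions_per_model):
        # most_common(1) yields the label with the most votes; on a tie it
        # returns the label that appeared first among the models' votes.
        label, _ = Counter(student_votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical pass/fail predictions for four students (not study data).
lr_pred = ["pass", "fail", "pass", "pass"]
svm_pred = ["pass", "pass", "fail", "pass"]
knn_pred = ["fail", "fail", "pass", "pass"]

ensemble_pred = hard_vote([lr_pred, svm_pred, knn_pred])
print(ensemble_pred)
```

With an odd number of base models and two classes, no ties can occur, so the tie-breaking rule only matters for other configurations.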
The models' discrimination ability was evaluated on a real dataset containing information on medical students over a five-year period (n = 1005). The findings indicate that ensemble ML models, specifically RF and stacking, performed best in predicting CMBSE status. For score prediction, LR was the best of the classic regressors, with the lowest root-mean-square deviation (RMSD) (0.134) and the highest coefficient of determination (R²) (0.62), whereas the RF model was the best overall, with the lowest RMSD (0.077) and the highest R² (0.80). Furthermore, grade point average (GPA) and grades in Anatomical Sciences, Biochemistry, Parasitology, and Entomology showed the strongest positive correlation with the outcomes.
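For readers less familiar with the two regression metrics, the following minimal sketch computes RMSD and R² from their standard definitions. A lower RMSD and a higher R² both indicate a better fit. The score values are hypothetical and assume scores scaled to the range 0 to 1; they are not taken from the study.

```python
import math


def rmsd(y_true, y_pred):
    """Root-mean-square deviation between observed and predicted scores."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical normalized exam scores (illustrative only, not study data).
observed = [0.2, 0.4, 0.6, 0.8]
predicted = [0.3, 0.4, 0.5, 0.8]

print(round(rmsd(observed, predicted), 4))
print(round(r_squared(observed, predicted), 4))
```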
The comparison showed that ensemble ML models outperform classic models. Therefore, the presented framework could be considered a suitable alternative for the CMBSE and other comparable medical licensing examinations.