Khan Uzma, Khan Yasir Naseem
Department of Clinical Sciences, College of Medicine, Al Rayan National Colleges, Medina Al-Munawara, Saudi Arabia.
Department of Basic Sciences, College of Medicine, Al Rayan National Colleges, Medina Al-Munawara, Saudi Arabia.
J Educ Eval Health Prof. 2025;22:19. doi: 10.3352/jeehp.2025.22.19. Epub 2025 Jun 19.
This study investigated the correlation between task-based checklist scores and global rating scores (GRS) in objective structured clinical examinations (OSCEs) for fourth-year undergraduate medical students and aimed to determine whether both methods can be used reliably for standard setting.
A comparative observational study was conducted at Al Rayan College of Medicine, Saudi Arabia, involving 93 fourth-year students during the 2023-2024 academic year. OSCEs from 2 General Practice courses were analyzed, each comprising 10 stations assessing clinical competencies. Students were scored using both task-specific checklists and holistic 5-point GRS. Reliability was evaluated using Cronbach's α, and the relationship between the 2 scoring methods was assessed using the coefficient of determination (R²). Ethical approval and informed consent were obtained.
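The two statistics named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names `cronbach_alpha` and `r_squared` are hypothetical, the input layout (one row per student, one column per station) is assumed, and R² is computed here as the squared Pearson correlation, which equals the coefficient of determination of a simple linear regression of one score on the other.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a score matrix (assumed layout:
    one row per student, one column per station)."""
    k = len(rows[0])  # number of stations (items)
    # Sample variance of each station's scores across students.
    item_vars = [variance(col) for col in zip(*rows)]
    # Sample variance of each student's total score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def r_squared(checklist, grs):
    """Squared Pearson correlation between paired checklist and GRS
    scores for one station."""
    mx, my = mean(checklist), mean(grs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(checklist, grs))
    sxx = sum((a - mx) ** 2 for a in checklist)
    syy = sum((b - my) ** 2 for b in grs)
    return sxy ** 2 / (sxx * syy)
```

In use, `cronbach_alpha` would be called once per course on the students-by-stations score matrix, and `r_squared` once per station on the paired checklist and GRS scores; the study's data are not reproduced here.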
The mean OSCE score was 76.7 in Course 1 (Cronbach's α=0.85) and 73.0 in Course 2 (Cronbach's α=0.81). R² values varied by station and competency. Strong correlations were observed in procedural and management skills (R² up to 0.87), while weaker correlations appeared in history-taking stations (R² as low as 0.35). The variability across stations highlighted the context-dependence of alignment between checklist and GRS methods.
Both checklists and GRS exhibit reliable psychometric properties. Their combined use improves validity in OSCE scoring, but station-specific application is recommended. Checklists may anchor pass/fail decisions, while GRS may assist in assessing borderline performance. This hybrid model increases fairness and reflects clinical authenticity in competency-based assessment.