Imperial College School of Medicine, Imperial College London, London, UK.
Wellcome Trust-MRC Institute of Metabolic Science, University of Cambridge, Cambridge, UK.
Med Teach. 2021 Mar;43(3):341-346. doi: 10.1080/0142159X.2020.1845909. Epub 2020 Nov 16.
The forthcoming UK Medical Licensing Assessment will require all medical schools in the UK to ensure that their students pass an appropriately designed Clinical and Professional Skills Assessment (CPSA) prior to graduation and registration with a licence to practise medicine. The requirements for the CPSA will be set by the General Medical Council, but individual medical schools will be responsible for implementing their own assessments. It is therefore important that assessors from different medical schools across the UK agree on what standard of performance constitutes a fail, pass or good grade.
We used an experimental video-based, single-blinded, randomised, internet-based design. We created videos of simulated student performances of a clinical examination at four scripted standards: clear fail (CF), borderline (BD), clear pass (CPX) and good (GD). Assessors from ten regions across the UK were randomly assigned to watch five videos in one of 12 different combinations and asked to give competence domain scores and an overall global grade for each simulated candidate. Inter-rater agreement for the total domain scores was measured using the intraclass correlation coefficient (ICC), based on a two-way random-effects model for absolute agreement.
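The ICC variant described here (two-way random effects, absolute agreement) is often denoted ICC(2,1) in the Shrout-Fleiss taxonomy. As an illustration of the statistic only (this is not the authors' analysis code, and the example ratings matrix is invented), it can be computed from the two-way ANOVA mean squares as follows:

```python
import numpy as np

def icc2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: (n_targets, k_raters) matrix, one row per rated performance,
    one column per assessor. Illustrative sketch, not the study's code.
    """
    Y = np.asarray(ratings, dtype=float)
    n, k = Y.shape
    grand = Y.mean()
    row_means = Y.mean(axis=1)   # per-target (performance) means
    col_means = Y.mean(axis=0)   # per-rater means

    # Sums of squares for a two-way ANOVA without replication
    ss_rows = k * np.sum((row_means - grand) ** 2)
    ss_cols = n * np.sum((col_means - grand) ** 2)
    ss_total = np.sum((Y - grand) ** 2)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-targets mean square
    msc = ss_cols / (k - 1)              # between-raters mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square

    # Absolute-agreement ICC: rater variance counts against agreement
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical data: two raters score three performances; rater 2 is
# consistently one point higher, which lowers absolute agreement.
offset = icc2_1([[1, 2], [2, 3], [3, 4]])
perfect = icc2_1([[1, 1], [2, 2], [3, 3]])
```

Because absolute agreement (rather than consistency) is used, the systematic one-point offset in the first matrix reduces the ICC below 1, while identical ratings yield an ICC of exactly 1.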
A total of 120 assessors enrolled in the study, of whom 98 were eligible for analysis. The ICC was 0.93 (95% CI 0.81-0.99). The mean percentage agreement with the scripted global grade was 74.4% (range 40.8-96.9%).
The inter-rater agreement amongst assessors across the UK when rating simulated candidates performing at scripted levels is excellent. The level of agreement on the overall global performance level for simulated candidates is also high. These findings suggest that assessors from across the UK viewing the same simulated performances show high levels of agreement on the standards expected of students at the 'clear fail', 'borderline', 'clear pass' and 'good' levels.