Developing a video-based method to compare and adjust examiner effects in fully nested OSCEs.

Affiliations

Medical School Education Research Group (MERG), Keele University School of Medicine, Keele, UK.

Department of Acute Medicine, Fairfield General Hospital, Pennine Acute Hospitals NHS Trust, Bury, UK.

Publication information

Med Educ. 2019 Mar;53(3):250-263. doi: 10.1111/medu.13783. Epub 2018 Dec 21.

Abstract

BACKGROUND

Although averaging across multiple examiners' judgements reduces unwanted overall score variability in objective structured clinical examinations (OSCE), designs involving several parallel circuits of the OSCE require that different examiner cohorts collectively judge performances to the same standard in order to avoid bias. Prior research suggests the potential for important examiner-cohort effects in distributed or national examinations that could compromise fairness or patient safety, but despite their importance, these effects are rarely investigated because fully nested assessment designs make them very difficult to study. We describe initial use of a new method to measure and adjust for examiner-cohort effects on students' scores.

METHODS

We developed video-based examiner score comparison and adjustment (VESCA): volunteer students were filmed 'live' on 10 out of 12 OSCE stations. Following the examination, examiners additionally scored station-specific common-comparator videos, producing partial crossing between examiner cohorts. Many-facet Rasch modelling and linear mixed modelling were used to estimate and adjust for examiner-cohort effects on students' scores.
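The way common-comparator videos link otherwise separate examiner cohorts can be illustrated with a deliberately simplified sketch. It is not the many-facet Rasch or linear mixed model used in the study: it only estimates each cohort's stringency as its mean deviation from the cross-cohort consensus on the shared videos, then subtracts that offset from live scores. All function names, cohort labels and scores below are hypothetical.

```python
from statistics import mean

def cohort_offsets(video_scores):
    """Estimate each cohort's leniency (+) or stringency (-).

    video_scores: {cohort: {video_id: score}}; assumes every cohort
    scored the same set of comparator videos (the partial crossing).
    """
    videos = sorted(next(iter(video_scores.values())))
    # Consensus score per video = mean across all cohorts.
    consensus = {
        v: mean(scores[v] for scores in video_scores.values()) for v in videos
    }
    # Offset = cohort's average deviation from the consensus.
    return {
        cohort: mean(scores[v] - consensus[v] for v in videos)
        for cohort, scores in video_scores.items()
    }

def adjust_live_scores(live_scores, offsets):
    """Subtract the cohort offset from each (student, cohort, score) record."""
    return {student: score - offsets[cohort] for student, cohort, score in live_scores}

# Hypothetical data: cohort A scores the same videos 1 point higher than cohort B.
video_scores = {
    "A": {"v1": 5, "v2": 4},
    "B": {"v1": 4, "v2": 3},
}
live_scores = [("s1", "A", 5.0), ("s2", "B", 4.0)]

offsets = cohort_offsets(video_scores)
adjusted = adjust_live_scores(live_scores, offsets)
```

In this toy case the two students receive the same adjusted score, because the one-point gap in their live scores matches the gap between their cohorts on the shared videos. A model-based approach such as the one the authors describe would additionally account for student ability, station difficulty and measurement error.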

RESULTS

After accounting for students' ability, examiner cohorts differed substantially in their stringency or leniency: for students of the same ability scored by different examiner cohorts, the maximal global score difference was 0.47 out of 7.0 (Cohen's d = 0.96) and the maximal total percentage score difference was 5.7% (Cohen's d = 1.06). Corresponding adjustment of students' global and total percentage scores altered the theoretical classification of 6.0% of students for both measures (either pass to fail or fail to pass), whereas the scores of 8.6-9.5% of students were altered by at least 0.5 standard deviations of student ability.
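As a reading aid, the reported effect sizes can be related to the raw differences with a short calculation. It assumes that Cohen's d here is the raw difference divided by a standard deviation on the same scale; the abstract does not state which standard deviation was used, so the implied values are back-calculations rather than reported results. The function name is hypothetical.

```python
def implied_sd(difference, cohens_d):
    """Back-calculate the SD implied by d = difference / SD (an assumption)."""
    return difference / cohens_d

# Figures taken from the Results paragraph.
global_sd = implied_sd(0.47, 0.96)   # global score, 7-point scale
total_sd = implied_sd(5.7, 1.06)     # total percentage score

print(round(global_sd, 2), round(total_sd, 2))  # 0.49 5.38
```

Under that assumption, the largest between-cohort gap is roughly one standard deviation on either measure.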

CONCLUSIONS

Despite typical reliability, the examiner cohort that students encountered had a potentially important influence on their score, emphasising the need for adequate sampling and examiner training. Development and validation of VESCA may offer a means to measure and adjust for potential systematic differences in scoring patterns that could exist between locations in distributed or national OSCE examinations, thereby ensuring equivalence and fairness.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff33/6519246/908ad7402165/MEDU-53-250-g001.jpg
