Pugh Debra, Halman Samantha, Desjardins Isabelle, Humphrey-Murto Susan, Wood Timothy J
a Department of Medicine, University of Ottawa, The Ottawa Hospital, Ottawa, Ontario, Canada.
b Department of Medicine, University of Ottawa, The Ottawa Hospital, and Department of Innovation in Medical Education, University of Ottawa, Ottawa, Ontario, Canada.
Teach Learn Med. 2016 Oct-Dec;28(4):406-414. doi: 10.1080/10401334.2016.1218337.
Construct: The impact of using nonbinary checklists for scoring residents from different levels of training participating in objective structured clinical examination (OSCE) progress tests was explored.
OSCE progress tests typically employ rating instruments similar to those used in traditional OSCEs. However, progress tests differ from other assessment modalities because learners from different stages of training participate in the same examination, which can pose challenges when deciding how to assign scores. In an attempt to better capture performance, nonbinary checklists were introduced in two OSCE progress tests. The purposes of this study were (a) to identify differences in the use of checklist options (e.g., done satisfactorily, attempted, or not done) by task type, (b) to analyze the impact of different scoring methods using nonbinary checklists for two OSCE progress tests (nonprocedural and procedural) for Internal Medicine residents, and (c) to determine which scoring method is better suited for a given task.
A retrospective analysis examined differences in scores (n = 119) for two OSCE progress tests (procedural and nonprocedural). Three scoring methods (hawk, dove, and hybrid) varied in how stringently they awarded marks for nonbinary checklist items rated as done satisfactorily, attempted, or not done. Difficulty, reliability (internal consistency), item-total correlations, and pass rates were compared for each OSCE using the three scoring methods.
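To make the idea of scoring stringency concrete, the following minimal sketch scores one station's nonbinary checklist under three weighting schemes. The abstract does not state the marks each method awards, so the weights below are illustrative assumptions (hawk: credit only for items done satisfactorily; dove: full credit for attempted items; hybrid: half credit for attempted items), and the names `WEIGHTS` and `station_score` are hypothetical.

```python
from statistics import mean

# ASSUMED weights per rating category; the abstract does not give the
# actual marks awarded by each method, so these values are illustrative.
WEIGHTS = {
    "hawk":   {"done": 1.0, "attempted": 0.0, "not_done": 0.0},
    "hybrid": {"done": 1.0, "attempted": 0.5, "not_done": 0.0},
    "dove":   {"done": 1.0, "attempted": 1.0, "not_done": 0.0},
}


def station_score(ratings, method):
    """Return a percentage score for one station's checklist ratings."""
    weights = WEIGHTS[method]
    return 100.0 * mean(weights[rating] for rating in ratings)


# One hypothetical resident on a four-item checklist.
ratings = ["done", "attempted", "not_done", "done"]

for method in ("hawk", "hybrid", "dove"):
    print(method, station_score(ratings, method))
```

Under these assumed weights, the same set of ratings yields the lowest score with the hawk scheme and the highest with the dove scheme, since the schemes differ only in the credit given to attempted items.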
Mean OSCE scores were highest using the dove method and lowest using the hawk method. The hawk method resulted in higher item-total correlations for most stations, but there were differences by task type. Overall score reliability calculated using the three methods did not differ significantly. Pass-fail status differed as a function of scoring methods and exam type, with the hawk and hybrid methods resulting in higher failure rates for the nonprocedural OSCE and the dove method resulting in a higher failure rate for the procedural OSCE.
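The abstract does not name the statistics behind "reliability (internal consistency)" and "item-total correlations". As a sketch only, the code below assumes Cronbach's alpha for internal consistency and a corrected item-total correlation (each station against the sum of the remaining stations); both choices, the function names, and the data are assumptions rather than the authors' documented procedure.

```python
from statistics import mean, pvariance


def cronbach_alpha(scores):
    """scores: one row per examinee, one column per station."""
    k = len(scores[0])
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def corrected_item_total(scores, j):
    """Correlate station j with the total of all other stations."""
    item = [row[j] for row in scores]
    rest = [sum(row) - row[j] for row in scores]
    return pearson(item, rest)


# Hypothetical scores: three examinees, three stations.
scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
alpha = cronbach_alpha(scores)
r_first = corrected_item_total(scores, 0)
```

Running the same two functions on score matrices produced by each scoring method would allow the kind of side-by-side comparison the results describe.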
The use of different scoring methods for nonbinary OSCE checklists resulted in differences in mean scores and pass-fail status. The results varied with procedural and nonprocedural OSCEs.