Giuliano Dominic A, McGregor Marion
J Chiropr Educ. 2014 Spring;28(1):16-20. doi: 10.7899/JCE-13-31. Epub 2014 Feb 27.
Objective: This study combined a learning outcomes-based checklist with salient characteristics derived from wisdom-of-crowds theory to test whether differing groups of judges (diversity maximized versus expertise maximized) could appropriately assess videotaped, manikin-based simulation scenarios.
Methods: Two groups of 3 judges scored 9 videos of interns managing a simulated cardiac event. The first group had a diverse range of knowledge of simulation procedures, while the second group was more homogeneous in its knowledge and had greater simulation expertise. All judges viewed 3 types of videos (predebriefing, postdebriefing, and 6-month follow-up) in a blinded fashion and provided their scores independently. Intraclass correlation coefficients (ICCs) were used to assess the reliability of judges as related to group membership. Scores from each group of judges were averaged to determine the impact of group on scores.
Results: ICCs were strong for both groups of judges (diverse, 0.89; expert, 0.97), with the diverse group having a much wider 95% confidence interval for the ICC. Analysis of variance of the average checklist scores indicated no significant difference between the 2 groups of judges for any of the types of videos assessed (F = 0.72, p = .4094). There was, however, a statistically significant difference between the types of videos (F = 14.39, p = .0004), with higher scores at the postdebriefing and 6-month follow-up time periods.
Conclusions: The results of this study provide optimism for assessment procedures in simulation using learning outcomes-based checklists and a small panel of judges.
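As an illustration of the reliability analysis described in the Methods, the following is a minimal sketch of how an ICC can be computed for one panel of judges. The abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement form (often written ICC(2,1) for a single judge and ICC(2,k) for the panel average) is an assumption here. The function name `icc_two_way_random` and the score matrix are hypothetical and are not data from the study.

```python
import numpy as np


def icc_two_way_random(scores):
    """Return (single-judge ICC, average-of-judges ICC).

    Assumes a two-way random-effects, absolute-agreement model.
    `scores` is a 2-D array: rows are videos, columns are judges.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape  # n videos, k judges

    grand_mean = x.mean()
    row_means = x.mean(axis=1)  # mean score per video
    col_means = x.mean(axis=0)  # mean score per judge

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * np.sum((row_means - grand_mean) ** 2)
    ss_cols = n * np.sum((col_means - grand_mean) ** 2)
    ss_total = np.sum((x - grand_mean) ** 2)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    icc_single = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    icc_average = (ms_rows - ms_error) / (
        ms_rows + (ms_cols - ms_error) / n
    )
    return icc_single, icc_average


# Hypothetical checklist scores: 9 videos (rows) scored by 3 judges (columns).
hypothetical_scores = [
    [12, 13, 12],
    [15, 14, 15],
    [18, 18, 17],
    [10, 11, 10],
    [16, 15, 16],
    [19, 18, 19],
    [11, 12, 12],
    [17, 17, 16],
    [14, 15, 14],
]

single, average = icc_two_way_random(hypothetical_scores)
print(f"single-judge ICC: {single:.3f}")
print(f"panel-average ICC: {average:.3f}")
```

Running the function once per panel (diverse and expert) would give one coefficient per group, which is the kind of comparison the Results report. Confidence intervals are not computed in this sketch.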